Wednesday, November 5, 2014

Release of a daily benchmark dataset - version 1

The ISTI benchmark working group includes a PhD student looking at benchmarking daily temperature homogenisation algorithms. This largely follows the concepts laid out in the benchmark working group's publication. Significant progress has been made in this field. This post announces the release of a small daily benchmark dataset focusing on four regions in North America. These regions can be seen in Figure 1. 

Figure 1 Station locations of the four benchmark regions. Blue stations are in all worlds. Red stations only appear in worlds 2 and 3.

These benchmarks have similar aims to the global benchmarks that are currently being produced by the ISTI working group, namely to:
 


  1. Assess the performance of current homogenisation algorithms and provide feedback to allow for their improvement 
  2. Assess how realistic the created benchmarks are, to allow for improvements in future iterations 
  3. Quantify the uncertainty that is present in data due to inhomogeneities both before and after homogenisation algorithms have been run on them

A perfect algorithm would return the inhomogeneous data to their clean form – correctly identifying the size and location of the inhomogeneities and adjusting the series accordingly. The inhomogeneities that have been added will not be made known to the testers until the completion of the assessment cycle – mid 2015. This is to ensure that the study is as fair as possible with no testers having prior knowledge of the added inhomogeneities.

The data are formed into three worlds, each consisting of the four regions shown in Figure 1. World 1 is the smallest and contains only those stations shown in blue in Figure 1, Worlds 2 and 3 are the same size as each other and contain all the stations shown.

Homogenisers are requested to prioritise running their algorithms on a single region across worlds instead of on all regions in a single world. This will hopefully maximise the usefulness of this study in assessing the strengths and weaknesses of the process. The order of prioritisation for the regions is Wyoming, South East, North East and finally the South West.

This study will be more effective the more participants it has and if you are interested in participating please contact Rachel Warren (rw307 AT exeter.ac.uk). The results will form part of a PhD thesis and therefore it is requested that they are returned no later than Friday 12th December 2014. However, interested parties who are unable to meet this deadline are also encouraged to contact Rachel.

There will be a further smaller release in the next week that is just focussed on Wyoming and will explore climate characteristics of data instead of just focusing on inhomogeneity characteristics.

No comments:

Post a Comment