Tuesday, November 29, 2016

IMF data on the world economy: what a difference a data version makes

Data sources are regularly updated. Users typically assume that this means that new, more recent data are added and that errors are corrected. Newer data are better. But are they? And what are the implications for replication? This guest blog for The Replication Network points out the challenges and potential benefits that arise when different versions of the same data exist.

Variations between the different vintages of a data set are not necessarily problematic. They provide insight into the measurement error in the data source, and a better understanding of measurement error may help establish why a replication fails or succeeds. Moreover, performing replications over many different vintages can test the robustness of the original study's findings. If all the data versions lead to the same conclusion, this strengthens confidence in the replication's verdict on the original study (be it positive or negative). What matters is not that the data versions differ, but whether the findings are similar across them. As a result, different data versions can be turned into an important asset for replication research.
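
As a rough illustration of what replicating over several vintages could look like in practice, here is a minimal Python sketch. It is not the method of any particular study: the simple regression slope stands in for whatever estimate the original study reports, the helper names (`ols_slope`, `replicate_across_vintages`, `same_sign`) are hypothetical, and the three vintages contain made-up numbers rather than real IMF data.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def replicate_across_vintages(vintages):
    """Run the same estimate on every vintage; return {vintage_name: slope}."""
    return {name: ols_slope(xs, ys) for name, (xs, ys) in vintages.items()}


def same_sign(estimates):
    """True if every vintage yields an estimate with the same sign."""
    signs = {(s > 0) - (s < 0) for s in estimates.values()}
    return len(signs) == 1


# Made-up numbers standing in for three vintages of the same series
# (assumption: purely illustrative, not taken from any real data source).
vintages = {
    "vintage_a": ([1, 2, 3, 4], [2.0, 4.1, 5.9, 8.2]),
    "vintage_b": ([1, 2, 3, 4], [2.2, 3.9, 6.3, 7.8]),
    "vintage_c": ([1, 2, 3, 4], [1.8, 4.4, 5.7, 8.5]),
}

estimates = replicate_across_vintages(vintages)

# The spread of the estimates gives a crude sense of how much the
# differences between vintages move the result.
spread = max(estimates.values()) - min(estimates.values())
agree = same_sign(estimates)
```

If `agree` is true and `spread` is small relative to the estimates themselves, the conclusion does not hinge on which vintage was used. A large spread, or a sign that flips between vintages, would suggest that measurement error in the data source is large enough to matter for the replication's verdict.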