Why do paper and citation counts vary so much year on year?
Late in the development of the 2007 rankings, this ranking project moved from the Essential Science Indicators (ESI) from Thomson Reuters to Scopus from Elsevier. A bibliometric database of this size is highly complex and, given the file sizes involved, processing can take days. Between 2007 and 2008 the algorithms used to retrieve data were substantially improved, eliminating double counting between disciplines and double counting between different affiliations that are ultimately attributed to the same institution. In many cases, this has resulted in dramatically reduced overall paper and citation counts in 2008 (and beyond). The dataset overall, however, carries a correlation coefficient of 0.96 with 2007 and 0.93 with 2006 (where data from ESI were used). As a result, whilst the raw data may be quite different, the final scores ought to be fairly similar, depending on changes in staffing levels.
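
As a rough illustration of why counts can fall sharply while the year-on-year correlation stays high, the sketch below removes the two kinds of double counting described above and then computes a Pearson correlation coefficient. It is not the project's actual retrieval code: the record layout, the `ALIASES` table, the function names and all figures are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical alias table: affiliation strings that resolve to one institution.
ALIASES = {
    "Univ. of Example": "University of Example",
    "Example University Medical School": "University of Example",
}

# Hypothetical records: (paper_id, affiliation, discipline).
RECORDS = [
    ("p1", "Univ. of Example", "Physics"),
    ("p1", "Univ. of Example", "Chemistry"),                  # same paper, two disciplines
    ("p2", "Example University Medical School", "Medicine"),
    ("p2", "University of Example", "Medicine"),              # same paper, two affiliations
    ("p3", "Other Institute", "Physics"),
]


def naive_counts(records):
    """Count one paper per record, so repeated papers are double counted."""
    counts = defaultdict(int)
    for _paper_id, affiliation, _discipline in records:
        counts[ALIASES.get(affiliation, affiliation)] += 1
    return dict(counts)


def deduplicated_counts(records):
    """Count each distinct paper once per institution."""
    papers = defaultdict(set)
    for paper_id, affiliation, _discipline in records:
        papers[ALIASES.get(affiliation, affiliation)].add(paper_id)
    return {institution: len(ids) for institution, ids in papers.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative paper counts for four institutions, before and after the change.
COUNTS_BEFORE = [400, 300, 200, 100]
COUNTS_AFTER = [210, 150, 95, 55]

if __name__ == "__main__":
    print(naive_counts(RECORDS))
    print(deduplicated_counts(RECORDS))
    print(round(pearson(COUNTS_BEFORE, COUNTS_AFTER), 3))
```

In this toy data the counts roughly halve for every institution, yet the correlation between the two years stays close to 1, because the institutions keep almost the same relative positions.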