An estimate of the consensus project paper search coverage

As part of my involvement in The Consensus Project (TCP), which recently published its results, I looked into some aspects of the data that were not part of the final paper. One of them was the proportion of the literature covered by the project.

Here's the description of the search from the paper:

"In March 2012, we searched the ISI Web of Science for papers published from 1991–2011 using topic searches for ‘global warming’ or ‘global climate change’. Article type was restricted to ‘article’, excluding books, discussions, proceedings papers and other document types. The search was updated in May 2012 with papers added to the Web of Science up to that date."

This resulted in 12,465 papers, but after eliminating papers that were non-peer-reviewed, not related to climate, and papers without abstracts, the resulting number of papers was 11,944.

To check the completeness of the search, we should compare the search results to some other known sample. To me, the obvious sample for comparison is the reference lists of the IPCC Fourth Assessment Report (AR4), because I think they cover the subject reasonably well. It should be noted, however, that the IPCC reference lists don't contain all the papers on the subject; they are only a subset, just like the TCP sample. The comparison between the TCP sample and the IPCC reference lists presented below can therefore only show whether the TCP paper search failed to cover the subject well.

I made a cross-comparison between the TCP sample and the reference lists of AR4. I didn't go over all AR4 chapters, though, only a few selected ones. In TCP, papers were categorized into different subject areas, so I took an equivalent AR4 chapter for each subject area. For methods, I selected Working Group I chapter 9, "Understanding and Attributing Climate Change". The obvious choice for paleoclimate was Working Group I chapter 6, "Palaeoclimate". For impacts, I used Working Group II chapter 1, "Assessment of observed changes and responses in natural and managed systems". For mitigation, I used Working Group III chapter 7, "Industry", and chapter 8, "Agriculture" (two chapters in order to keep the paper count roughly similar to the other comparisons).

The reference lists of the AR4 chapters have some entries that are out of scope for TCP: non-peer-reviewed documents (for example reports, websites, books), comment papers (comments and replies), and papers outside the TCP timeframe (1991-2012). These were excluded from the comparison and from the AR4 paper count. The table below shows the results of the comparison. The rows of the table are: category in TCP, equivalent AR4 chapter(s), number of relevant entries in AR4, number of those AR4 entries found in TCP, and the coverage percentage (with its standard error) of the TCP paper search. The last column gives the overall estimate, calculated by adding the four subject areas together.
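The post does not say how the entries were matched, so the following is only a minimal sketch of how such a cross-comparison could be automated. The `Reference` record, its `kind` labels, the title normalization, and the placeholder entries are all assumptions made for illustration, not the procedure or data actually used.

```python
import re
from dataclasses import dataclass


@dataclass
class Reference:
    """Hypothetical record for one AR4 reference-list entry."""
    title: str
    year: int
    kind: str  # assumed labels: "article", "report", "book", "comment", ...


def normalize(title):
    """Lowercase and strip punctuation so formatting differences do not block a match."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def in_scope(ref, first_year=1991, last_year=2012):
    """Keep only peer-reviewed articles inside the TCP timeframe."""
    return ref.kind == "article" and first_year <= ref.year <= last_year


def compare(ar4_refs, tcp_titles):
    """Return (number of relevant AR4 entries, number of those found in TCP)."""
    relevant = [r for r in ar4_refs if in_scope(r)]
    tcp_index = {normalize(t) for t in tcp_titles}
    found = [r for r in relevant if normalize(r.title) in tcp_index]
    return len(relevant), len(found)


# Placeholder entries only; these are not real references.
ar4_refs = [
    Reference("Placeholder paper A", 2003, "article"),
    Reference("Placeholder paper B", 1988, "article"),   # before the timeframe
    Reference("Placeholder report C", 2005, "report"),   # not peer-reviewed
    Reference("Placeholder paper D", 2001, "article"),
]
tcp_titles = ["placeholder paper a.", "Some other paper"]

relevant_count, found_count = compare(ar4_refs, tcp_titles)
print(relevant_count, found_count)
```

Exact title matching after normalization is the simplest choice; a real comparison would probably also need to handle misspelled or abbreviated titles by hand.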

| Category | methods | paleoclimate | impacts | mitigation | overall |
|---|---|---|---|---|---|
| AR4 chapter | WG1 CH9 | WG1 CH6 | WG2 CH1 | WG3 CH7+8 | overall |
| Paper count | 493 | 542 | 570 | 318 | 1923 |
| Found | 53 | 26 | 68 | 21 | 168 |
| Coverage % | 11 ± 2 | 5 ± 1 | 12 ± 2 | 7 ± 2 | 8.7 ± 0.7 |
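The coverage row can be reproduced from the two count rows. The sketch below does that in Python. The error estimate it uses, sqrt(found) / total, is an assumption (a simple counting error); the post does not state how its standard errors were computed, and the per-category values this formula gives (roughly 0.9 to 1.5 percentage points) need not match the rounded values in the table.

```python
import math

# Counts copied from the table: (relevant AR4 entries, entries also found in TCP).
counts = {
    "methods": (493, 53),
    "paleoclimate": (542, 26),
    "impacts": (570, 68),
    "mitigation": (318, 21),
}


def coverage(found, total):
    """Return (coverage in percent, counting-error estimate in percent).

    The error formula is an assumption, not taken from the post.
    """
    pct = 100.0 * found / total
    err = 100.0 * math.sqrt(found) / total
    return pct, err


results = {name: coverage(found, total) for name, (total, found) in counts.items()}

# Overall figure: pool all four subject areas before dividing.
total_all = sum(total for total, _ in counts.values())
found_all = sum(found for _, found in counts.values())
results["overall"] = coverage(found_all, total_all)

for name, (pct, err) in results.items():
    print(f"{name:12s} {pct:5.1f} % +/- {err:.1f}")
```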

The overall coverage is estimated at 8.7 %, which means that to achieve complete coverage we would have had to look at roughly 140,000 papers! The search found a larger share of the papers in the methods and impacts subject areas. For methods (which contains the basic climate science papers) this is expected, but it would make more sense if methods papers had a higher coverage percentage than impacts papers. It may be that the authors of impacts papers are more inclined to mention the phenomenon causing the impacts they study, whereas for climate scientists it might go without saying that they are studying climate change related issues, and they perhaps concentrate more on the fine details of the issue.

Somewhat surprising is the low coverage percentage for paleoclimate papers. Perhaps they just don’t mention global climate change or global warming that much. Mitigation is a bit higher than paleoclimate, which is understandable: mitigation is done because of global warming, so mitigation papers can be expected to use the term more often than paleoclimate papers. It also makes some sense that impacts papers have higher coverage than mitigation papers, but I didn’t expect the gap between them to be this large (before seeing the numbers I considered them roughly equal in this sense).

The results of this comparison can also be used in another way. We can use the numbers above to estimate the total number of all published global climate change and global warming related papers between 1991 and 2012. The reasoning is this: the TCP paper search found 8.7 % of the papers referenced in AR4. The total paper count from the TCP paper search is 11,944. If 11,944 represents 8.7 % of the papers, then the total number of papers must be about (11,944 * 100 %) / (8.7 %) ≈ 137,000. The more precise figure of 136,693 used below comes from the unrounded coverage (168/1923) applied to the 11,942 papers in the four subject areas.
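The scaling step is a single division, but the result shifts slightly depending on whether the rounded or unrounded coverage is used and on which sample size goes in. A minimal sketch, using only numbers quoted in this post (the function name `estimate_total` is just an illustrative choice):

```python
# Inputs from the post: 168 of the 1923 checked AR4 references were found in TCP.
found_in_tcp = 168
ar4_entries = 1923
coverage = found_in_tcp / ar4_entries  # about 0.087


def estimate_total(sample_size, coverage_fraction):
    """Scale a sample up to the full literature.

    Assumes the sample covers the same fraction of all papers
    as it covers of the AR4 references.
    """
    return sample_size / coverage_fraction


# 11,944 is the TCP sample size quoted in the text;
# 11,942 is the sum of the four subject areas in the table.
for sample in (11944, 11942):
    print(sample, round(estimate_total(sample, coverage)))

# The same calculation with the coverage rounded to 8.7 %.
print(round(estimate_total(11944, 0.087)))
```

All variants land close to 137,000, so the choice does not affect the conclusions.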

I calculated the numbers similarly for each subject area. Additionally, I estimated the total numbers of endorsement and rejection papers between 1991 and 2012. The table below shows the results. The rows of the table are: category, number of papers in TCP in the category, coverage % calculated above, estimated total number of papers as described above, number of endorsement papers in TCP in the category, estimated total number of endorsement papers, number of rejection papers in TCP in the category, and estimated total number of rejection papers.

| Category | methods | paleoclimate | impacts | mitigation | overall |
|---|---|---|---|---|---|
| TCP paper count | 1991 | 785 | 5780 | 3386 | 11942 |
| Coverage % | 11 | 5 | 12 | 7 | 8.7 |
| Total paper count | 18520 | 16364 | 48450 | 51274 | 136693 |
| TCP endorsements | 581 | 169 | 1235 | 1912 | 3897 |
| Total endorsements | 5404 | 3523 | 10352 | 28953 | 44607 |
| TCP rejections | 53 | 4 | 17 | 3 | 77 |
| Total rejections | 493 | 83 | 143 | 45 | 881 |
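The estimated totals in this table can be reproduced by dividing each TCP count by the unrounded coverage of its category. The sketch below assumes exactly that, plus ordinary round-half-up rounding; both are inferred from the numbers rather than stated in the post, and the helper names are illustrative.

```python
import math

# Per-category inputs copied from the two tables:
# (AR4 entries, found in TCP, TCP papers, TCP endorsements, TCP rejections)
data = {
    "methods":      (493, 53, 1991, 581, 53),
    "paleoclimate": (542, 26, 785, 169, 4),
    "impacts":      (570, 68, 5780, 1235, 17),
    "mitigation":   (318, 21, 3386, 1912, 3),
}


def scale_up(count, ar4_entries, found):
    """Divide a TCP count by the unrounded coverage fraction found / ar4_entries."""
    return count * ar4_entries / found


def round_half_up(x):
    """Round to the nearest integer, with .5 going up (Python's round() goes to even)."""
    return math.floor(x + 0.5)


estimates = {}
for name, (ar4, found, papers, endorse, reject) in data.items():
    estimates[name] = tuple(
        round_half_up(scale_up(c, ar4, found)) for c in (papers, endorse, reject)
    )

# The overall column pools the four categories first and then scales once,
# so it is not the sum of the per-category estimates.
ar4_all, found_all, papers_all, endorse_all, reject_all = (
    sum(row[i] for row in data.values()) for i in range(5)
)
estimates["overall"] = tuple(
    round_half_up(scale_up(c, ar4_all, found_all))
    for c in (papers_all, endorse_all, reject_all)
)

for name, (total, endorsements, rejections) in estimates.items():
    print(f"{name:12s} {total:7d} {endorsements:6d} {rejections:4d}")
```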

As mentioned above, the estimate for all published papers relating to global climate change and global warming between 1991 and 2012 is about 137,000. Reading one of them each day would take 375 years. If you were a climate scientist wanting to read all of them (in order to keep up with the trade), and your career lasted, say, 50 years, you would need to read more than 7 papers each day. That’s doable, isn’t it?
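For anyone who wants to check the reading-time arithmetic, here is a quick sketch. It assumes 365-day years with no days off.

```python
total_papers = 137_000   # rounded estimate from the post
days_per_year = 365      # assumption: no leap years, no days off
career_years = 50

# Reading one paper per day.
years_at_one_per_day = total_papers / days_per_year

# Reading everything within one career.
papers_per_day = total_papers / (career_years * days_per_year)

print(f"{years_at_one_per_day:.0f} years at one paper per day")
print(f"{papers_per_day:.1f} papers per day over {career_years} years")
```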

As we all know, there are far more endorsement papers than rejection papers, but the interesting thing here might be the total number of rejection papers, which I estimate to be 881. That is actually quite a few. Perhaps one of them is the one that turns the whole of climate science around. I wouldn’t bet on it, though.

Posted by Ari Jokimäki on Monday, 10 June, 2013

