Springer Compact coverage analysis

Christoph Broschinski


OpenAPC maintains two different data collections:

  1. Core data file
    • Original OpenAPC collection, started in 2015
    • Contains OA articles with associated APC costs
    • More than 72,000 articles (01/2019)
  2. Offsetting data file
    • Started in 2016 on behalf of ESAC at the MPDL
    • Contains OA articles based on alternative accounting models
    • No APC costs linked to articles
    • More than 22,000 articles (01/2019)

The offsetting data consists mostly (98.5%) of articles published under the Springer Compact agreements (SCA).

SCA: OA publishing in Springer hybrid journals for authors at selected institutions at no additional cost (+ read access)

SCA consortial partners report article data to OpenAPC on a regular basis:

Offsetting Coverage

OpenAPC side project Offsetting Coverage: Is it possible to calculate the effects of the SCAs on OA publishing numbers?

Offsetting Coverage (2)


  1. List of all Springer journals covered by the agreements
    • Easy: Can be directly obtained as CSV file from the Springer site
  2. Publication metrics (article lists + total number of articles/OA articles per year) for all journals contained in 1.
    • Difficult, as with all bibliometric evaluations (at least if meant to be complete)
    • First approach: Crossref
    • Didn't work well due to incompatible time frames (date of approval/acceptance/online publication/print publication...) and reporting delays
    • Second approach: Obtain data directly from the SpringerLink portal for every journal
    • Did work, but is tricky (webscraping required)
    • Springer API might be a better option, but not evaluated yet
  • All data is fetched and combined automatically using a script; the results are presented in a dedicated treemap view
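The time-frame problem noted for the Crossref approach can be illustrated with a minimal sketch. The record below is invented for illustration (including its DOI); the field names `accepted`, `published-online` and `published-print` are modelled on Crossref work metadata, and the helper names `year_of` and `years_by_field` are hypothetical, not part of the OpenAPC script.

```python
# Hypothetical Crossref-style record: field names follow Crossref work
# metadata, all values (including the DOI) are invented for illustration.
record = {
    "DOI": "10.1000/example",
    "accepted": {"date-parts": [[2017, 11, 20]]},
    "published-online": {"date-parts": [[2017, 12, 15]]},
    "published-print": {"date-parts": [[2018, 2]]},
}


def year_of(record, field):
    """Return the year stored in a Crossref-style date field, or None if absent."""
    date = record.get(field)
    if not date or not date.get("date-parts") or not date["date-parts"][0]:
        return None
    return date["date-parts"][0][0]


def years_by_field(record, fields=("accepted", "published-online", "published-print")):
    """Map each date field to the publication year it would imply."""
    return {field: year_of(record, field) for field in fields}


# The same article lands in 2017 or 2018 depending on the date field chosen.
print(years_by_field(record))
```

Depending on which date a data source uses, per-year article counts for a journal may not line up between sources, which is one plausible reading of the incompatibility described above.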

Offsetting Coverage: Results

    Aggregated numbers for all hybrid Springer journals eligible for SCA:

    period | number of journals | total articles | OA articles | OA share | offsetting articles | OA share covered by offsetting
    2018 (still incomplete!) | 1,906 | 259,163 | 22,794 | 8.80% | 7,992 | 35.06%
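The two percentages in the 2018 row follow directly from the article counts. As a minimal sketch (the function name `share` is an assumption, not taken from the OpenAPC code), the OA share is computed relative to all articles, while the offsetting share is computed relative to the OA articles only:

```python
def share(part, total):
    """Return part/total as a percentage, rounded to two decimals."""
    if total == 0:
        raise ValueError("total must be non-zero")
    return round(100 * part / total, 2)


# Counts from the aggregated 2018 row
total_articles = 259_163
oa_articles = 22_794
offsetting_articles = 7_992

oa_share = share(oa_articles, total_articles)          # OA articles among all articles
offsetting_share = share(offsetting_articles, oa_articles)  # offsetting articles among OA articles

print(f"OA share: {oa_share:.2f}%")
print(f"OA share covered by offsetting: {offsetting_share:.2f}%")
```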

    Key insights:

    Simulated Offsetting Coverage

    With the offsetting coverage project established, another question came up: How would the participation of another consortial partner (Germany) have influenced the numbers?

    Using the Web of Science, the KOA bibliometrics group prepared a data set (~17,500 articles) based on the following filter parameters:

    OpenAPC then conducted several preprocessing steps, which reduced the data set further (to ~9,400 articles):

    The data was then merged with the original offsetting data to create a new, simulated data set.
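The merge step can be sketched as follows. This is not the actual OpenAPC procedure: matching on a normalized DOI, letting real records take precedence over simulated ones, and the record layout with a `doi` key are all assumptions made for illustration, and `merge_by_doi` is a hypothetical helper.

```python
def merge_by_doi(real_records, simulated_records):
    """Combine two record lists, keeping one entry per normalized DOI.

    Assumption: real offsetting records win if a DOI occurs in both sets.
    Records without a DOI are skipped because they cannot be matched.
    """
    merged = {}
    for source, records in (("real", real_records), ("simulated", simulated_records)):
        for rec in records:
            doi = (rec.get("doi") or "").strip().lower()
            if not doi:
                continue  # no DOI, cannot be processed
            # setdefault keeps the first entry seen, i.e. the real one
            merged.setdefault(doi, {**rec, "doi": doi, "source": source})
    return list(merged.values())


# Invented example records
real = [{"doi": "10.1000/A"}]
simulated = [{"doi": "10.1000/a"}, {"doi": "10.1000/b"}, {"doi": None}]

for rec in merge_by_doi(real, simulated):
    print(rec["doi"], rec["source"])
```

Tagging each record with its origin keeps the real and simulated portions distinguishable after the merge, so both views can be compared from one data set.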

    The treemap development server provides a custom view.

    Simulated Offsetting Coverage: Results

    Comparison between real and simulated coverage data:

    period | data type | number of journals | total articles | OA articles | OA share | offsetting articles | OA share covered by offsetting


    1. A lot of articles in the simulated data were published in journals dedicated to medical case studies (in German language).
      • Authors are usually medical practitioners who do not necessarily work at academic teaching hospitals. It is thus questionable whether these articles would be covered by a real-world SCA.
      • Consequence: "Correct" OA numbers for simulated data set probably lower.
    2. About 600 articles had no DOI and could not be processed.
      • Consequence: "Correct" OA numbers for simulated data set probably higher.
    3. About 100 articles published in former Nature journals were ignored (journals not listed on SpringerLink).
      • Consequence: "Correct" OA numbers for simulated data set probably higher.

    Key insights:

