
APC Error detection and reporting in OpenAPC

Christoph Broschinski





Overview: OpenAPC data collections

  1. Core data file: data/apc_de.csv
    • Original OpenAPC collection, started in 2015
    • Contains OA articles with associated APC costs
    • Currently 84,412 articles (v3.69.2) contributed by 247 institutions
    • Aggregated APC costs: 163,704,001 €

Overview: OpenAPC data collections (2)

  1. Transformative Agreements data file (formerly: Offsetting data file): data/transformative_agreements/transformative_agreements.csv
    • Started in 2016
    • Contains OA articles based on alternative accounting models
    • No APC costs linked to articles
    • Currently 31,701 articles (v3.69.2) contributed by 205 institutions
    • Article data from 12 different transformative agreements (but mostly Springer Compact)

OpenAPC Error Checking

  1. The whole OpenAPC data set is automatically tested for errors whenever new articles are integrated. This includes:
    • Identifier checks (DOIs/ISSNs, via regular expressions)
    • Searching for DOI duplicates
    • Name consistency checks (publisher names/journal titles)
    • Logical tests (Journals in DOAJ cannot be hybrid)
    • ...
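
A minimal sketch of how checks like those listed above could look in Python. This is an illustration, not the OpenAPC test suite: the regular expressions are simplified, and the column names `issn`, `is_hybrid` and `doaj`, as well as the `"TRUE"` flag value, are assumptions made for this example.

```python
import re
from collections import Counter

# Simplified patterns (assumptions, not the expressions OpenAPC itself uses).
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ISSN_RE = re.compile(r"^\d{4}-\d{3}[\dXx]$")


def check_records(records):
    """Return (row_number, message) tuples for a list of dict records."""
    problems = []
    # Count DOIs case-insensitively, since DOIs are not case-sensitive.
    doi_counts = Counter(r["doi"].lower() for r in records if r.get("doi"))
    for row, rec in enumerate(records, start=1):
        doi = rec.get("doi", "")
        issn = rec.get("issn", "")
        # Identifier checks via regular expressions
        if doi and not DOI_RE.match(doi):
            problems.append((row, "malformed DOI"))
        if issn and not ISSN_RE.match(issn):
            problems.append((row, "malformed ISSN"))
        # DOI duplicate search
        if doi and doi_counts[doi.lower()] > 1:
            problems.append((row, "duplicate DOI"))
        # Logical test: a journal listed in DOAJ cannot be hybrid
        if rec.get("doaj") == "TRUE" and rec.get("is_hybrid") == "TRUE":
            problems.append((row, "journal listed in DOAJ but marked hybrid"))
    return problems
```

Calling `check_records` on rows read with `csv.DictReader` would return one entry per detected problem, so a single article can be reported more than once.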
institution                period  euro      doi                        journal_full_title
University of Cambridge    2014    -65.16    10.1099/vir.0.071365-0     Journal of General Virology
LSHTM                      2014    11900     10.3402/gha.v7.23621       Global Health Action
University of Southampton  2016    25069.54  10.1017/s0144686x16001057  Ageing and Society

(Solution to rows 2 and 3: an additional zero was added (correct amount: 1190€); the decimal point was shifted (correct amount: 2506.94€))
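
The negative amount in the first row is the kind of error a simple plausibility test can catch. The sketch below is a hypothetical helper, not part of OpenAPC; the function name and messages are made up for illustration.

```python
import math


def check_amount(value):
    """Return a problem description for an APC amount string, or None."""
    try:
        amount = float(value)
    except ValueError:
        return "not a number"
    if not math.isfinite(amount):
        return "not a finite number"
    if amount <= 0:
        return "amount is not positive"
    return None
```

Here `check_amount("-65.16")` reports a non-positive amount, while `check_amount("11900")` and `check_amount("25069.54")` return `None`: both are valid positive numbers, so a sign test alone cannot detect an extra zero or a shifted decimal point.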

APC amounts: Statistical Analysis

Example: Frontiers in Psychology (1149 articles, Mean: 1610€, SD: 534€)

-2*SD    -SD      Mean     +SD      +2*SD
542€     1076€    1610€    2144€    2678€
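
The band limits above follow directly from the mean and standard deviation. A minimal sketch, assuming that amounts outside mean ± 2*SD are treated as suspicious (the slide shows these bands but does not state the threshold actually used) and that the sample standard deviation is meant; the function names are hypothetical.

```python
from statistics import mean, stdev


def band(m, sd, k=2):
    """Return the (lower, upper) limits m - k*sd and m + k*sd."""
    return (m - k * sd, m + k * sd)


def is_outlier(amount, m, sd, k=2):
    """True if amount lies outside the band m +/- k*sd."""
    lower, upper = band(m, sd, k)
    return amount < lower or amount > upper


def journal_outliers(amounts, k=2):
    """Flag amounts of one journal that deviate more than k SDs from its mean."""
    m = mean(amounts)
    sd = stdev(amounts)  # sample standard deviation (assumption)
    return [a for a in amounts if is_outlier(a, m, sd, k)]
```

With the Frontiers in Psychology figures, `band(1610, 534)` gives `(542, 2678)`, matching the outer limits shown. In practice the amounts would be grouped by journal first, since typical APC levels differ widely between journals.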

Reporting Results

Approach: Automated report generation

Last slide

Thanks for your attention!

Find this presentation at https://bit.ly/2oTbfjZ