ArXiv and the REF open access policy

📝 Original Info

  • Title: ArXiv and the REF open access policy
  • ArXiv ID: 1804.06648
  • Date: 2018-04-20
  • Authors: Author information is not provided in the source.

📝 Abstract

HEFCE's Policy for open access in the post-2014 Research Excellence Framework states "authors' outputs must have been deposited in an institutional or subject repository". There is no definition of a subject repository in the policy; however, there is a footnote stating: "Individuals depositing their outputs in a subject repository are advised to ensure that their chosen repository meets the requirements set out in this policy." The longest standing subject repository (or repository of any kind) is arXiv.org, established in 1991. arXiv is an open access repository of scientific research available to authors and researchers worldwide and acts as a scholarly communications forum informed and guided by scientists. Content held on arXiv is free to the end user and researchers can deposit their content freely. As of April 2018, arXiv held over 1,377,000 eprints. In some disciplines arXiv is considered essential to the sharing and publication of research. The HEFCE requirements on repositories are defined in the Information and Audit Requirements, which list the "Accepted date", the "Version of deposited file" and "available open access immediately after the publisher embargo" as information expected as part of the REF submission. However, while many records in arXiv have multiple versions of a work, the Author's Accepted Manuscript is not identified and there is no field to record the acceptance date of the work. Because arXiv does not capture these two specific information points, it does not meet the technical requirements to be a compliant subject repository for the purposes of REF. This paper presents the case that articles deposited to arXiv are, in general, compliant with the requirements of the HEFCE policy. The paper summarises some work undertaken by Jisc to establish whether there are other factors that can indicate the likelihood of formal compliance with the HEFCE policy.

💡 Deep Analysis

Figure 1

📄 Full Content

The Higher Education Funding Council for England (HEFCE)'s Policy for open access in the post-2014 Research Excellence Framework states "authors' outputs must have been deposited in an institutional or subject repository" (HEFCE, 2015). There is no definition of a subject repository in the policy; however, there is a footnote stating: "Individuals depositing their outputs in a subject repository are advised to ensure that their chosen repository meets the requirements set out in this policy."

The longest standing subject repository (or repository of any kind) is arXiv.org, established in 1991 by Paul Ginsparg at the Los Alamos National Laboratory. It is currently hosted at Cornell University. arXiv is an open access repository of scientific research available to authors and researchers worldwide and acts as a scholarly communications forum informed and guided by scientists. Content held on arXiv is free to the end user and researchers can deposit their content freely.

As of April 2018, arXiv held over 1,377,000 e-prints in Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, Statistics, Electrical Engineering and Systems Science, and Economics. In some disciplines arXiv is considered essential to the sharing and publication of research.

The HEFCE requirements on repositories are defined in the Information and Audit Requirements (HEFCE, 2014), which list the “Accepted date”, the “Version of deposited file” and “available open access immediately after the publisher embargo” as information expected as part of the REF submission.

This means there are two barriers to using the papers deposited in arXiv for compliance with REF’s open access policy:

• Many records in arXiv have multiple versions of a work attached. However, these are labelled Version 1, Version 2 and so on, rather than using the NISO terminology widely accepted in research: Submitted Version, Author’s Accepted Manuscript and Version of Record (NISO, 2008).
• There is no field to record the acceptance date of the work. The HEFCE policy requires this information because compliance is tied to deposit within three months of acceptance.

Because arXiv does not capture these two specific information points, it does not meet the technical requirements to be a compliant subject repository for the purposes of REF.

During 2015-2016 the UK Higher Education sector and arXiv made a significant effort to negotiate the development work needed to add these data fields to arXiv records. Technical specifications were agreed and in-principle agreement for funding of the work was secured. However, personnel challenges meant that the project could not be completed by the HEFCE policy start date of 1 April 2016.

This paper presents the case that articles deposited to arXiv are, in general, compliant with the requirements of the HEFCE policy. The paper summarises some work undertaken by Jisc to establish whether there are other factors that can indicate the likelihood of formal compliance with the HEFCE policy.

  1. Study 1 - Estimating the formal compliance of articles on arXiv with REF policy

For a work to be formally compliant with the HEFCE policy, it must:

  1. Have metadata deposited in a repository within three months of acceptance
  2. Have the author accepted manuscript or published version deposited within three months of acceptance
  3. Be made open access immediately after the publisher embargo

It is not possible to assess what percentage of works on arXiv formally comply with REF due to the following missing information:

  1. Acceptance date
  2. Information about the version available on arXiv

Instead, three factors were studied that could indicate the likelihood of formal compliance: presence of a DOI, date of metadata deposit, and date of last update in arXiv.
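The three conditions above can be sketched as a simple check. This is a minimal illustration only, not the policy's official test: the record fields, the version labels "AAM" and "VoR", and the reading of "three months" as three calendar months are all assumptions made for the example. A record with no acceptance date or no identified version, which is the situation for arXiv records, is reported as not assessable rather than non-compliant.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


@dataclass
class DepositRecord:
    # All field names are hypothetical, chosen for this sketch.
    accepted: Optional[date]          # acceptance date (not recorded by arXiv)
    metadata_deposited: date          # date the metadata record was created
    fulltext_deposited: date          # date the full text was deposited
    fulltext_version: Optional[str]   # "AAM", "VoR", or None if unknown
    open_access_from: date            # date the full text became openly available
    embargo_ends: date                # end of the publisher embargo


def is_formally_compliant(rec: DepositRecord) -> Optional[bool]:
    """Return True/False, or None when compliance cannot be assessed."""
    if rec.accepted is None or rec.fulltext_version is None:
        return None  # the two missing information points
    deadline = add_months(rec.accepted, 3)  # assumed reading of "three months"
    return (
        rec.metadata_deposited <= deadline
        and rec.fulltext_version in {"AAM", "VoR"}
        and rec.fulltext_deposited <= deadline
        and rec.open_access_from <= rec.embargo_ends
    )


# Made-up records for illustration.
on_time = DepositRecord(date(2016, 5, 1), date(2016, 6, 1), date(2016, 6, 1),
                        "AAM", date(2016, 6, 1), date(2017, 5, 1))
late = DepositRecord(date(2016, 5, 1), date(2016, 6, 1), date(2016, 9, 1),
                     "AAM", date(2016, 9, 1), date(2017, 5, 1))
arxiv_like = DepositRecord(None, date(2016, 6, 1), date(2016, 6, 1),
                           None, date(2016, 6, 1), date(2016, 6, 1))
```

Returning None for the arXiv-like record mirrors the point made above: without an acceptance date and a version label, formal compliance is unknown, which is why indirect indicators are needed.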

A study was undertaken by Jisc in 2016 considering a sample of articles available in arXiv taken from 2011-2015 to establish whether it is possible to determine which version of the work is uploaded and when. The arXiv API was used to download metadata in XML format for a sample of 1200 articles submitted to arXiv between 2011 and 2016. The XML was parsed and, where a DOI was present, the Crossref API was called on the DOI and the metadata was joined with arXiv. Basic analysis was then performed using Python scripts and Excel.
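The parsing and joining step described above might look roughly like the following. This is a hedged sketch, not the script used in the study: it parses a small, made-up Atom feed in the shape the arXiv API returns (Atom entries with an optional `arxiv:doi` element), and it only builds the Crossref REST API lookup URL for each DOI rather than calling the service. The sample identifiers, DOI and function name are invented for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML namespaces used in arXiv API Atom responses.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "arxiv": "http://arxiv.org/schemas/atom",
}

# Made-up feed for illustration; real responses carry many more fields.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:arxiv="http://arxiv.org/schemas/atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00001v2</id>
    <published>2014-03-01T10:00:00Z</published>
    <updated>2014-06-15T09:30:00Z</updated>
    <title>Hypothetical paper with a DOI</title>
    <arxiv:doi>10.1234/example.doi</arxiv:doi>
  </entry>
  <entry>
    <id>http://arxiv.org/abs/0000.00002v1</id>
    <published>2015-01-10T12:00:00Z</published>
    <updated>2015-01-10T12:00:00Z</updated>
    <title>Hypothetical paper without a DOI</title>
  </entry>
</feed>"""


def parse_entries(feed_xml: str) -> list:
    """Extract the fields relevant to the study from an Atom feed string."""
    root = ET.fromstring(feed_xml)
    records = []
    for entry in root.findall("atom:entry", NS):
        doi = entry.findtext("arxiv:doi", default=None, namespaces=NS)
        records.append({
            "arxiv_id": entry.findtext("atom:id", namespaces=NS),
            "published": entry.findtext("atom:published", namespaces=NS),
            "updated": entry.findtext("atom:updated", namespaces=NS),
            "doi": doi,
            # Where a DOI is present, this is the URL a Crossref lookup would use.
            "crossref_url": (
                f"https://api.crossref.org/works/{quote(doi)}" if doi else None
            ),
        })
    return records


records = parse_entries(SAMPLE_FEED)
# Share of sampled records that list a DOI.
doi_share = sum(1 for r in records if r["doi"]) / len(records)
print(f"{doi_share:.0%} of records list a DOI")
```

Keeping the DOI as the join key means the Crossref metadata can be merged onto each arXiv record afterwards, and records without a DOI simply carry no Crossref fields.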

The presence of a DOI is the most basic element of compliance, since without it there is no sure way of identifying the published article corresponding to the preprint. In addition, Crossref can be used to supplement the metadata about the article for a more complete bibliographic record.

Digital Object Identifiers (DOIs) are allocated to articles by publishers. A record within arXiv that contains a DOI indicates that the work has been published and that the author has updated the record to include the DOI. Roughly half of the sample of articles (53%) listed a DOI.
