DOI Extraction

1. Overview and Context

This procedure extracts the DOIs from a PDF by means of a custom Python script.

2. Triggers

The execution of this procedure is usually triggered by

3. Steps to Be Performed

  • Save a copy of the DOI Extraction Script to your hard drive as doiextract.py
  • Move the PDF to the same folder
  • Open the script file in a code editor
  • Change pdf_path to the file name of the PDF (incl. .pdf)
  • Open a command line tool like Terminal
  • Navigate to the folder containing the script and the PDF
  • Run python3 doiextract.py
  • The script will generate a CSV file named after the PDF file and suffixed with _dois.csv
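
The DOI Extraction Script itself is not reproduced in this procedure. As a rough illustration of what the steps above amount to, the following is a minimal sketch, not the actual script. It assumes the third-party pypdf package for reading the PDF, a simplified regular expression for recognising DOIs, a single CSV column named doi, and an output name built from the PDF name without its .pdf extension. The helper names extract_dois, csv_path_for and read_pdf_text are hypothetical; only pdf_path and the _dois.csv suffix come from the steps.

```python
import csv
import re
from pathlib import Path

# Mirrors the pdf_path setting from the steps; "book.pdf" is a placeholder.
pdf_path = "book.pdf"

# Simplified DOI pattern (an assumption): "10.", a 4-9 digit registrant
# code, "/", then any run of characters up to whitespace, quotes or <>.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def extract_dois(text):
    """Return the unique DOIs found in text, in order of first appearance."""
    seen = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".,;:)]}")  # drop trailing sentence punctuation
        if doi not in seen:
            seen.append(doi)
    return seen


def csv_path_for(pdf_file):
    """Map e.g. 'book.pdf' to 'book_dois.csv' in the same folder (assumed naming)."""
    pdf_file = Path(pdf_file)
    return pdf_file.with_name(pdf_file.stem + "_dois.csv")


def read_pdf_text(pdf_file):
    """Extract the text of every page; assumes pypdf is installed."""
    from pypdf import PdfReader  # assumption: pip install pypdf

    reader = PdfReader(pdf_file)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def main():
    dois = extract_dois(read_pdf_text(pdf_path))
    with open(csv_path_for(pdf_path), "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["doi"])  # assumed column header
        writer.writerows([doi] for doi in dois)


if __name__ == "__main__":
    if Path(pdf_path).is_file():
        main()
    else:
        print(f"PDF not found: {pdf_path}")
```

Because pdf_path is a bare file name, it is resolved against the current working directory. This is why the PDF is moved next to the script and the command is run from that folder.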

4. Additional Information

5. Document Control

Document ID PRO-002003
Document Owner Vincent
Version 1.0
Last Date of Change October 2, 2025
Next Review Due Date
Version & Change Tracking