Working with batches of PDF files

Programming Historian

Article
Author

Moritz Mähr

Published

2020

DOI

10.46430/phen0088

Learn how to perform OCR and text extraction with free command-line tools such as Tesseract and Poppler, and how to get an overview of large numbers of PDF documents using topic modeling.

Citation

BibTeX citation:
@article{mähr2020,
  author = {Mähr, Moritz},
  title = {Working with Batches of {PDF} Files},
  journal = {Programming Historian},
  volume = {9},
  date = {2020-01-01},
  url = {https://moritzmaehr.ch/publications/mahr2020i.html},
  doi = {10.46430/phen0088},
  langid = {en}
}
For attribution, please cite this work as:
Mähr, Moritz. 2020. “Working with Batches of PDF Files.” Programming Historian 9 (January). https://doi.org/10.46430/phen0088.