Working with batches of PDF files
Programming Historian
Article
Bibliographic details for Working with batches of PDF files
Learn how to perform OCR and text extraction with free command-line tools such as Tesseract and Poppler, and how to get an overview of large numbers of PDF documents using topic modeling.
Citation
BibTeX citation:
@article{mähr2020,
author = {Mähr, Moritz},
title = {Working with Batches of {PDF} Files},
journal = {Programming Historian},
volume = {9},
date = {2020-01-01},
url = {https://moritzmaehr.ch/publications/mahr2020i.html},
doi = {10.46430/phen0088},
langid = {en}
}
For attribution, please cite this work as:
Mähr, Moritz. 2020. “Working with Batches of PDF Files.”
Programming Historian 9 (January). https://doi.org/10.46430/phen0088.