
How do I extract text from PDF proofs if I want to produce a Word file matching the pagination of the proofs to allow me to use Word’s indexing software?

April 26, 2017 | Tags: Extract, file, matching, pagination, PDF, proofs, text, word


1 Answer


When PDF files have been prepared correctly (for example, with selectable text rather than scanned page images), you can extract the text by using the Select All command, copying, and then pasting into Word. However, the pagination of the PDF will not be retained, while all header and footer text (e.g. running heads) will be. You will therefore need to insert page breaks carefully in exactly the same places as they appear in the PDF, and remove the header/footer wording so that any indexed word it contains is not collected in the index.
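If you are comfortable with a little scripting, the page-break and running-head cleanup can be partly automated. The following is only a rough sketch in Python, not a tested workflow. It assumes you already have the text of each PDF page as a separate string (for instance, by copying one page at a time), and that each page starts with one running-head line and ends with one footer line. The function names and the `[[PAGEBREAK]]` marker are made up for illustration.

```python
# Hypothetical marker; pick any string that cannot occur in the proofs.
PAGE_MARKER = "[[PAGEBREAK]]"


def strip_running_lines(page_text, head_lines=1, foot_lines=1):
    """Drop the first `head_lines` and last `foot_lines` lines of one page.

    Assumption: running heads and footers occupy a fixed number of lines
    at the top and bottom of every page. Check this against your proofs.
    """
    lines = page_text.strip("\n").split("\n")
    end = len(lines) - foot_lines if foot_lines else len(lines)
    return "\n".join(lines[head_lines:end])


def build_word_ready_text(pages, head_lines=1, foot_lines=1):
    """Join the cleaned pages, putting a marker line between each pair."""
    bodies = [strip_running_lines(p, head_lines, foot_lines) for p in pages]
    return ("\n" + PAGE_MARKER + "\n").join(bodies)


if __name__ == "__main__":
    # Two toy pages standing in for text copied from the proofs.
    pages = [
        "Running Head\nBody of page one.\n1",
        "Running Head\nBody of page two.\n2",
    ]
    print(build_word_ready_text(pages))
```

After pasting the result into Word, you could use Find and Replace to swap each marker for a manual page break (`^m` in the "Replace with" box), then check a few pages against the PDF before indexing. Pages such as chapter openers often have no running head, so they may need different `head_lines`/`foot_lines` values or manual correction.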