How do I extract text from PDF proofs to produce a Word file that matches the pagination of the proofs, so that I can use Word’s indexing feature?
When PDF files are prepared correctly, you can extract text from them by using the Select All command, copying, and then pasting into Word. However, the pagination of the PDF will not be retained, whereas all header and footer text (e.g. running heads) will be. You will therefore need to insert page breaks carefully, in exactly the same places as they appear in the PDF, and remove the header and footer wording so that any indexed words it contains are not collected in the index.
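If you are comfortable with a little scripting, the clean-up can be partly automated. The following is only a minimal sketch in Python, not a recommended workflow. It assumes you already have the text of each proof page as a separate string (for example, copied page by page), that the running head occupies a fixed number of lines at the top of each page, and that the page number occupies a fixed number of lines at the bottom. The function names `strip_running_lines` and `join_pages` are made up for this illustration. It also assumes that Word turns a form feed character into a manual page break when it opens a plain-text file, which you should check on a short sample first.

```python
FORM_FEED = "\f"  # ASCII 12; assumed to be read by Word as a manual page break


def strip_running_lines(page_text, head_lines=1, foot_lines=0):
    """Drop the first head_lines and last foot_lines lines of one page.

    Leading and trailing blank lines are removed first, so the counts
    apply to the visible running head and page number.
    """
    lines = page_text.strip().splitlines()
    end = len(lines) - foot_lines
    return "\n".join(lines[head_lines:end])


def join_pages(pages, head_lines=1, foot_lines=0):
    """Join the cleaned pages with a form feed between each pair,
    so that every proof page starts on a new page."""
    cleaned = (strip_running_lines(p, head_lines, foot_lines) for p in pages)
    return FORM_FEED.join(cleaned)


if __name__ == "__main__":
    # Hypothetical sample: a running head on the first line of each page
    # and the page number on the last line.
    sample_pages = [
        "Chapter One\nText of the first proof page.\n12",
        "Chapter One\nText of the second proof page.\n13",
    ]
    with open("proofs.txt", "w", encoding="utf-8") as out:
        out.write(join_pages(sample_pages, head_lines=1, foot_lines=1))
```

Opening the resulting `proofs.txt` in Word should give one Word page per proof page, without the running heads. Compare a few pages against the PDF before you start marking index entries, because running heads that vary in length, or pages that have none (such as chapter openers), will need correcting by hand.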