Is it possible to parse an existing PDF document and convert it to another format (HTML, DOC, Excel)?
No, the PDF format is just a canvas where text and graphics are placed without any structure information. As such there aren’t any ‘iText-objects’ in a PDF file. For instance: you can’t retrieve a table object from a PDF file. Tables are formed by placing text and lines at selected places.

So I’m still looking for some way to read a line in a PDF doc and parse it. Any suggestions for other PDF parsing packages I could use would be much appreciated.
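
To make the "canvas without structure" point concrete: since a PDF only records where each piece of text is drawn, "reading a line" usually means rebuilding it from positions. Below is a rough sketch of that idea, not a real parser. It assumes some extraction library has already handed back text fragments as `(x, y, text)` tuples; the tuple layout, the function name `group_into_lines`, and the `y_tolerance` value are all made up for illustration and don't correspond to any particular package's API.

```python
from typing import List, Tuple

# Hypothetical fragment shape: (x, y, text), with y measured from the
# bottom of the page, as in PDF user space.
Fragment = Tuple[float, float, str]


def group_into_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[str]:
    """Rebuild text lines from positioned fragments.

    Fragments whose y coordinate is within y_tolerance of the first
    fragment on a line are treated as belonging to that line.
    """
    # Top of the page first (largest y), then left to right.
    ordered = sorted(fragments, key=lambda f: (-f[1], f[0]))

    lines: List[Tuple[float, List[Tuple[float, str]]]] = []
    for x, y, text in ordered:
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append((y, [(x, text)]))

    # Within each line, order the pieces by x before joining them.
    return [
        " ".join(text for _, text in sorted(parts, key=lambda p: p[0]))
        for _, parts in lines
    ]


# Made-up fragments in arbitrary drawing order, as they might appear in a file.
sample = [
    (200.0, 680.0, "3"),
    (72.0, 700.0, "Name"),
    (72.0, 680.0, "Widget"),
    (200.0, 700.5, "Qty"),
]
rebuilt = group_into_lines(sample)
```

Here `rebuilt` ends up as two strings, one per visual row. Anything table-like (columns, cells) would still have to be guessed from the x positions, which is exactly why no table object can be read back out of the file.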