PPRuNe Forums - View Single Post - PDF to TXT conversion
Old 26th Apr 2009, 17:52
ChristiaanJ
 
Join Date: Jan 2005
Location: France
Posts: 2,315
PDF to TXT conversion

Most current-day .PDF files have been created from computer-readable files (.DOC, .RTF, etc.), and most recent versions of Acrobat Reader let you select and copy text (and even images) from such files, for instance to quote elsewhere.

Some older .PDF files are simply PDF-wrapped copies of scans (hence bitmaps) of old documents. In those files the "select and copy text" function of Acrobat Reader does not work, even though the pages are plainly text and that function seems to be based on some kind of OCR.
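
As a rough way to tell the two kinds of file apart, something like the Python sketch below might do. It only looks at the raw bytes of the file: a scan-only PDF usually embeds image objects but declares no fonts. The function name is made up for illustration, and the check is an assumption-laden heuristic, not a real PDF parser: newer PDFs that pack their dictionaries into compressed object streams will fool it.

```python
import re


def looks_like_scan_only(pdf_bytes: bytes) -> bool:
    """Guess whether a PDF holds only scanned page images (hypothetical helper).

    Heuristic, assuming the page dictionaries are stored uncompressed:
    the file contains image XObjects but no /Font entries at all.
    """
    # "/Font" as a complete name; "/FontDescriptor" etc. do not match.
    has_font = re.search(rb"/Font\b", pdf_bytes) is not None
    # Image XObjects are declared with "/Subtype /Image".
    has_image = re.search(rb"/Subtype\s*/Image\b", pdf_bytes) is not None
    return has_image and not has_font
```

To try it on a file, read the file in binary mode and pass the bytes in, e.g. `looks_like_scan_only(open("old_manual.pdf", "rb").read())`, where the file name is just a placeholder.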

Does anybody here know anything about tools to extract a text file from such an ancient .PDF file?
(Short, obviously, of printing, scanning, using an OCR program, and cleaning up afterwards.)

CJ