Computer/Internet Issues & Troubleshooting Anyone with questions about the terribly complex world of computers or the internet should try here. NOT FOR REPORTING ISSUES WITH PPRuNe FORUMS! Please use the subforum "PPRuNe Problems or Queries."

Extracting info for html page

8th December 2009 | 08:35
#1
Thread Starter
Per Ardua ad Astraeus
 
Joined: Mar 2000
Posts: 18,575
Likes: 4
From: UK

I am supplied with a wide variety of inputs for news on a village website.

First problem is extracting text from a MS Pub file. The HTML-formatted text then goes into tables on a prepared page that uses my style sheets. I cannot get 'Save as html' to produce code I can insert into the tables, and each page generates over 2000 NEW lines of CSS in true MS fashion. At the moment I am transcribing longhand into HTML - any tricks?
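One possible trick, offered as a rough sketch rather than a tested fix: run the 'Save as html' output through a small Python script that throws away the generated style blocks and per-tag attributes, leaving bare tags that can be pasted into a table cell and styled by the existing style sheets. The tag whitelist and the names `FragmentCleaner` and `clean_fragment` are illustrative assumptions, not anything Publisher provides. It uses only Python's standard `html.parser` module and has not been tried on real Publisher output.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist: tags worth keeping for a simple news item.
KEEP_TAGS = {"p", "br", "b", "strong", "i", "em", "u",
             "ul", "ol", "li", "a", "img", "h1", "h2", "h3"}
# Attributes that carry content rather than styling.
KEEP_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}
# Elements whose text content is dropped entirely.
SKIP_CONTENT = {"style", "script", "title"}
VOID_TAGS = {"br", "img"}


class FragmentCleaner(HTMLParser):
    """Re-emit whitelisted tags without styling attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in KEEP_TAGS:
            return  # tag is dropped, but its text is still kept
        allowed = KEEP_ATTRS.get(tag, set())
        attr_text = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in KEEP_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_fragment(html_text):
    """Return html_text reduced to bare, whitelisted tags and text."""
    parser = FragmentCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out).strip()
```

Colours and fonts are lost on purpose here; the idea is that they come back from the site's own style sheets once a class is added to the table cell by hand.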

Second one is extracting an image from a PDF file. I have tried various progs (I have Acrobat) but the resulting colour is not true. I am resorting to jpg screenshots for now.
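A possible alternative to screenshots, as a sketch under stated assumptions: photographs are often stored inside a PDF as raw JPEG data in streams marked `/DCTDecode`, so the original bytes can sometimes be copied out untouched rather than re-rendered. The function name `extract_jpeg_streams` is hypothetical, and the code assumes an unencrypted PDF whose images use `/DCTDecode` as their only filter; images stored any other way are simply skipped. Whether this cures the colour shift is not certain, since an embedded image that is CMYK or carries its own colour profile may still display differently in some viewers.

```python
import re

# "stream" keyword followed by end-of-line, but not the tail of "endstream".
STREAM_START = re.compile(rb"(?<!end)stream\r?\n")


def extract_jpeg_streams(pdf_bytes):
    """Return the raw JPEG bytes of each /DCTDecode stream found."""
    images = []
    pos = 0
    while True:
        match = STREAM_START.search(pdf_bytes, pos)
        if match is None:
            break
        end = pdf_bytes.find(b"endstream", match.end())
        if end == -1:
            break
        # Everything since the previous stream includes this stream's dictionary.
        header = pdf_bytes[pos:match.start()]
        data = pdf_bytes[match.end():end]
        if b"/DCTDecode" in header and data.startswith(b"\xff\xd8"):
            # Trim the line ending before "endstream" by cutting at the JPEG end marker.
            eoi = data.rfind(b"\xff\xd9")
            if eoi != -1:
                images.append(data[:eoi + 2])
        pos = end + len(b"endstream")
    return images
```

To use it, read the PDF in binary mode, pass the bytes to the function, and write each returned item to its own `.jpg` file opened with mode `"wb"`.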
BOAC is offline  
8th December 2009 | 10:15
#2
 
Joined: Jul 2003
Posts: 852
Likes: 3
From: Brum
If you have access to a Mac, FileJuicer will extract all text/jpegs etc. from almost any file...

FileJuicer

Nige
Nige321 is offline  
8th December 2009 | 11:14
#3
Thread Starter
Per Ardua ad Astraeus
 
Joined: Mar 2000
Posts: 18,575
Likes: 4
From: UK
Ta, but 'Macless in Gaza'!
BOAC is offline  
8th December 2009 | 14:30
#4
Thread Starter
Per Ardua ad Astraeus
 
Joined: Mar 2000
Posts: 18,575
Likes: 4
From: UK
TSC- thanks for reply - para by para

1- See post #1 - colour wrong
2- Yes for that society, that's what they are used to
3- will try 'filtered'
4- see post #1
BOAC is offline  
8th December 2009 | 17:35
#5
Administrator
 
Joined: Mar 2001
Aviation Qualifications: PPL
Posts: 8,121
Likes: 686
From: Twickenham, home of rugby
I use Notepad a lot to strip out almost everything except the plain ASCII text - works great for copying text from web pages for re-processing, as you lose all the crap.
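The same strip-everything step can be scripted when the starting point is a saved HTML file rather than the clipboard. This is only a sketch of the idea, not what Notepad does internally: it keeps the visible text, drops tags plus anything inside `style` or `script`, and starts a new line at common block tags. The names `TextOnly` and `to_plain_text` and the list of block tags are assumptions made for the example.

```python
import re
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, ignoring tags and style/script content."""

    # Assumed set of tags that should start a new line of text.
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.hidden = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.hidden += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.hidden:
            self.hidden -= 1

    def handle_data(self, data):
        if not self.hidden:
            self.parts.append(data)


def to_plain_text(html_text):
    """Return the text of html_text, one non-empty line per block."""
    parser = TextOnly()
    parser.feed(html_text)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [re.sub(r"\s+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

As with the Notepad route, all formatting and colour is lost, so this only helps with getting the words out.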

Dunno if that helps.

SD
Saab Dastard is offline  
8th December 2009 | 17:51
#6
Thread Starter
Per Ardua ad Astraeus
 
Joined: Mar 2000
Posts: 18,575
Likes: 4
From: UK
Yes - getting the basic text out was fine - it was all the lovely text formatting/colours etc. that took all the effort.
BOAC is offline  


Copyright © 2026 MH Sub I, LLC dba Internet Brands. All rights reserved. Use of this site indicates your consent to the Terms of Use.