 Postscript file corrupted - extracting text/patching
I have obtained a PostScript file (apparently generated with Microsoft Word)
of which I can only read the first page using gv (a moveto error is
reported on the next page). I have the following questions:

a) How do I extract just the text from the Postscript file? How is the raw
text in a Postscript file encoded?

b) Is it possible to fix a corrupted Postscript file (e.g. by extracting the
usable portions to a new file)?

Any help with the above would be greatly appreciated.

Best regards,
Theo van der Merwe (nt...@iafrica.com)



 Tue, 11 Nov 2003 20:22:12 GMT   
"Theo van der Merwe" (nt...@iafrica.com) writes:

I'm sure there are programs to accomplish the above, but allow me to suggest
reading the file with xpdf. It might work!

--
Merci........Yvan          Pour le plein air: Club Vertige
                               http://www.ncf.ca/vertige



 Tue, 11 Nov 2003 22:37:15 GMT   
Theo van der Merwe wrote:

Have you tried "ps2ascii" (comes with ghostscript)?

PS2ASCII(1)             Ghostscript Tools             PS2ASCII(1)

NAME
       ps2ascii  -  Ghostscript translator from PostScript or PDF
       to ASCII

SYNOPSIS
       ps2ascii [ input.ps [ output.txt ] ]
       ps2ascii input.pdf [ output.txt ]

DESCRIPTION
       ps2ascii  uses  gs(1)   to   extract   ASCII   text   from
       PostScript(tm)  or  Adobe  Portable  Document Format (PDF)
       files. If no files are specified on the command  line,  gs
       reads from standard input; but PDF input must come from an
       explicitly-named file, not standard input.  If  no  output
       file  is  specified, the ASCII text is written to standard
       output.
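For anyone scripting this, here is a minimal sketch of calling ps2ascii from Python, following the SYNOPSIS above. The helper names ps2ascii_command and extract_text are made up for illustration. It assumes ps2ascii is on the PATH, and it keeps whatever text was printed even if gs stops on a later page, which may or may not happen with a damaged file.

```python
import shutil
import subprocess
from typing import List, Optional


def ps2ascii_command(input_path: str, output_path: Optional[str] = None) -> List[str]:
    """Build the argument list: ps2ascii input.ps [output.txt] (hypothetical helper)."""
    cmd = ["ps2ascii", input_path]
    if output_path is not None:
        cmd.append(output_path)
    return cmd


def extract_text(input_path: str, output_path: Optional[str] = None) -> Optional[str]:
    """Run ps2ascii; return the text when no output file is given (hypothetical helper)."""
    if shutil.which("ps2ascii") is None:
        raise RuntimeError("ps2ascii not found on PATH; is Ghostscript installed?")
    result = subprocess.run(
        ps2ascii_command(input_path, output_path),
        capture_output=True,
        text=True,
        errors="replace",  # do not fail on bytes that are not valid in the locale encoding
    )
    # The exit status is deliberately not checked: with a damaged file,
    # any text produced before the error may still be worth keeping.
    return result.stdout if output_path is None else None
```

Calling extract_text("broken.ps") would return the extracted text as a string; the file name is only an example.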

--

-John (John.Thomp...@attglobal.net)



 Wed, 12 Nov 2003 05:27:43 GMT   
"Theo van der Merwe" (nt...@iafrica.com) writes:

First of all, be sure to use Ghostscript 7.0. I was using the old 5.x
and I find 7.0 much improved.

There should be a ps2ascii utility included with ghostscript.

The utility fixps (probably from psutils) might help.
I once had luck using file and dd to extract a PostScript file readable
by 5.x from a newer PostScript file generated by some Adobe program. Good
old file told me something like 'x bytes of garbage at the
beginning, PostScript file from byte x+1 to y, TIFF image from byte
y+1 to z', and with dd I extracted only the x+1-to-y part.
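
The same cut can be sketched in Python instead of file and dd. This is only a rough approximation, not what file does: it assumes the PostScript part starts at the first "%!PS" header and ends at the last "%%EOF" comment, which holds only if the generating program wrote those DSC comments. The function names are made up for illustration. It strips leading and trailing junk but will not repair an error inside the PostScript itself.

```python
def extract_postscript(data: bytes) -> bytes:
    """Return the bytes from the first '%!PS' through the last '%%EOF' (hypothetical helper)."""
    start = data.find(b"%!PS")
    if start == -1:
        raise ValueError("no %!PS header found")
    end = data.rfind(b"%%EOF")
    if end == -1 or end < start:
        # No trailer comment: keep everything after the header.
        return data[start:]
    return data[start:end + len(b"%%EOF")]


def extract_postscript_file(src: str, dst: str) -> None:
    """Read src, write only its PostScript portion to dst (hypothetical helper)."""
    with open(src, "rb") as f:
        cleaned = extract_postscript(f.read())
    with open(dst, "wb") as f:
        f.write(cleaned)
```

The last "%%EOF" is used rather than the first because embedded documents (EPS figures, for instance) can carry their own "%%EOF" comments before the real end of the file.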

--
Stefano - Hodie septimo Kalendas Iunias MMI est



 Wed, 12 Nov 2003 19:24:04 GMT   
 