 SED & line lengths
Hi!

A while ago I tried to modify a binary file that had some
text in it. The text was an sql 'SELECT' statement.

Anyway, SED complained of 'exceeding line length'.

After reading the man page, I discovered that sed cannot
read lines longer than 8192 bytes.

MY QUESTION: How could I have used UNIX utilities to
overcome my problem?

The only solution I had at the time was to write a C program.

Your insight would be very appreciated.

Thanks.



 Mon, 04 Feb 2002 03:00:00 GMT   

This may not qualify as a strictly Unix solution, but if you've got Perl
installed the following probably works (it does for me, anyhow):

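#!/usr/bin/perl
use strict;
use warnings;

# Slurp the whole input file (first argument) in binary mode.
open my $in, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
binmode $in;
my $theData = do { local $/; <$in> };
close $in;

# Follow every carriage return with a linefeed.
$theData =~ s/\r/\r\n/g;

# Write the result to the second argument.
open my $out, '>', $ARGV[1] or die "Cannot open $ARGV[1]: $!";
binmode $out;
print {$out} $theData;
close $out or die "Cannot close $ARGV[1]: $!";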

This is invoked by

<Perl program> <input file with long line> <new output file>

This assumes that the gist of the problem is you have all <CR>s and no
<LF>s.

- Dan T.



 Mon, 04 Feb 2002 03:00:00 GMT   

Ynot> This may not qualify as a strictly Unix solution, but if you've got Perl
Ynot> installed the following probably works (it does for me, anyhow):

[longish Perl program deleted]

All this being more-or-less equivalent to:

        perl -pe 's/\r/\r\n/g' <input >output

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<mer...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!



 Mon, 04 Feb 2002 03:00:00 GMT   

If the problem is that all \r's in the file need to become
\r\n's, then this looks like it will work, but it also looks overly
complex for what it does...
   perl -p015e '$_.="\n"' inputfile >outputfile

                --Ken Pizzini



 Tue, 05 Feb 2002 03:00:00 GMT   
Thank you all for your replies.

What does 's/\r/\r\n/g' do as a sed expression?

\r - ?
\n - newline?

And how does invoking it via perl differ from
calling the sed program directly?



 Tue, 05 Feb 2002 03:00:00 GMT   

: What does 's/\r/\r\n/g' do as a sed expression?

: \r - ?
: \n - newline?

: And how does invoking it via perl differ from
: calling the sed program directly?

\r is the traditional "carriage return" (ASCII octal 15) which means
nothing special to sed, which looks for the "linefeed" character (ASCII
octal 12) as its line delimiter - if sed sees no linefeeds then it doesn't
even see one complete line to process.
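
A quick way to check which delimiters a file really contains is to count the CR and LF bytes. This is only a sketch, assuming a POSIX shell with tr and wc; the helper name count_byte and the sample file are made up for illustration and are not from the thread:

```shell
#!/bin/sh
# count_byte FILE CHAR: print how many times a one-byte character
# (given as a tr escape such as '\r' or '\n') occurs in FILE.
# The function name is hypothetical, used only for this sketch.
count_byte() {
    tr -cd "$2" < "$1" | wc -c | tr -d ' '
}

# Build a small sample whose "lines" end in CR only.
sample=${TMPDIR:-/tmp}/cr_only_sample.$$
printf 'line one\rline two\rline three\r' > "$sample"

cr_count=$(count_byte "$sample" '\r')
lf_count=$(count_byte "$sample" '\n')
echo "CR: $cr_count  LF: $lf_count"
rm -f "$sample"
```

A file that reports plenty of CRs and no LFs is, as far as sed is concerned, one very long line.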

: perl -pe 's/\r/\r\n/g' <input >output

This is obviously *much* cleaner than the little program I suggested -
I've got to become more familiar with Perl myself!

- Dan T.



 Tue, 05 Feb 2002 03:00:00 GMT   
Thank you for your reply.

I am getting confused now - what is the difference between
a carriage return and a line feed?

Does 's/\r/\r\n/g' match a CR and replace it with
CR and LF? - I'm confused because I thought CR and LF
were the same.

Thanks.



 Tue, 05 Feb 2002 03:00:00 GMT   

: I am getting confused now - what is the difference between
: a carriage return and a line feed?

: Does 's/\r/\r\n/g' match a CR and replace it with
: CR and LF? - I'm confused because I thought CR and LF
: were the same.

I'm certainly not the best source for this information, but...

CRs and LFs are indeed distinct characters. Years ago the old teletype
machines required separate characters to instruct them to go down one line
(linefeed) and back to the left edge (carriage return) and both characters
have survived until today in the ASCII encoding specification. It's up to
individual applications just what significance these carry. Speaking to
the issue at hand, though, we receive huge PostScript files from all over,
and when I found that some, but not all, wouldn't respond properly to my
sed commands I examined them and found the line delimiters in these to be
merely \r (i.e., octal 15). I can't remember why I replaced these with
\r\n instead of just \n, but I know it made the sed commands run as
intended. Hope my gibberish is faintly enlightening :)

- Dan T.



 Tue, 05 Feb 2002 03:00:00 GMT   

Dan> : perl -pe 's/\r/\r\n/g' <input >output

Dan> This is obviously *much* cleaner than the little program I suggested -
Dan> I've got to become more familiar with Perl myself!

If you're interested, I can recommend a book or two, and maybe even
a training course.

:-)

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<mer...@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!



 Tue, 05 Feb 2002 03:00:00 GMT   

On 1999-08-19 d...@reply.here said:
   >Hi!
   >A while ago I tried to modify a binary file that had some
   >text in it. The text was an sql 'SELECT' statement.
   >Anyway, SED complained of 'exceeding line length'.
   >After reading the man page, I discovered that sed cannot
   >read lines longer than 8192 bytes.
   >MY QUESTION: How could I have used UNIX utilities to
   >overcome my problem?
Sed is not meant for editing binary files.  Perhaps you could have piped the
file through od.
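
As a rough illustration of the od idea (a sketch only, assuming an od that accepts the -An and -b options; the sample bytes are invented), an octal dump makes the control characters visible no matter how long the "line" is:

```shell
#!/bin/sh
# Sketch: dump bytes in octal so CR (015) and LF (012) become visible.
# od prints fixed-width rows, so the length of the input "line" does not matter.
# The two tr calls only flatten od's output onto one line with single spaces.
dump=$(printf 'a\rb\n' | od -An -b | tr '\n' ' ' | tr -s ' ')
echo "$dump"
```

Here the four input bytes show up as 141 015 142 012, that is "a", CR, "b", LF.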
   >The only solution I had at the time was to write a C program.
   >Your insight would be very appreciated.
That was probably the best solution, but I couldn't be sure without knowing
more about the file and the programs which create and use it.




 Tue, 05 Feb 2002 03:00:00 GMT   

A "carriage return" is codepoint 015 (13, 0xd) in the ASCII
character set (and its descendants).  A linefeed is codepoint 012
(10, 0xa).  The original design for these characters was that a
carriage return would act like pushing the platen (carriage) of a
typewriter over so that the next keystroke would land in the
leftmost column, and a linefeed would advance the paper up one
line.

A "newline" is the character or character sequence which
terminates lines in text files.  On Unix/POSIX systems using
ASCII-like character sets, a linefeed is used for this purpose.
On Macintosh systems, a carriage return is used for this.
On CP/M-like systems (including DOS) the pair CR+LF is used for
this.

If the implementation of sed being used correlates \r with a CR
and correlates \n with a LF (a not-too-uncommon state of
affairs), then yes, it will replace every instance of CR with
CR+LF.  This might be useful for converting Macintosh text files
for use on DOS systems.
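
For the original question (plain Unix utilities, no Perl), one possible route is tr, which works byte by byte and so should not care how long a "line" is. The sketch below swaps each CR for an LF, which is a different conversion from the CR to CR+LF one discussed above; the function name cr_to_lf and the sample data are invented for illustration:

```shell
#!/bin/sh
# Sketch: turn CR-terminated text into LF-terminated text with tr.
# cr_to_lf is a hypothetical helper name; the sample data is made up.
cr_to_lf() {
    tr '\r' '\n'
}

converted=$(printf 'select a\rfrom t\rwhere x = 1\r' | cr_to_lf)
printf '%s\n' "$converted"

# Once the data has real line delimiters, sed sees three short lines.
edited=$(printf '%s\n' "$converted" | sed 's/^from t$/from other_t/')
printf '%s\n' "$edited"
```

If the consumer of the file expects CR-only line ends, the result can be piped through tr '\n' '\r' afterwards to restore them. Note that this only suits text; a truly binary file may contain CR or LF bytes that are not line ends at all.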

                --Ken Pizzini



 Wed, 06 Feb 2002 03:00:00 GMT   

On 1999-08-20 d...@reply.here said:
   >Thank you all for your replies.
   >What does 's/\r/\r\n/g' do as a sed expression?
   >\r - ?
   >\n - newline?
   >And how does invoking it via perl differ from
   >calling the sed program directly?
\r is the CR character, \n is LF also known as newline, in programs that
recognize those backslash sequences, such as perl or awk.  Sed recognizes
\n in the pattern but not in the replacement string, and doesn't recognize
\r or other \letter or \number strings.
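
Where a given sed does not understand \r, one workaround is to let the shell put a literal carriage return into the script. This is a sketch under the assumption of a POSIX shell and printf; the variable names and sample input are illustrative only:

```shell
#!/bin/sh
# Sketch: hand sed a literal carriage return instead of relying on "\r".
# Command substitution strips trailing newlines but keeps the CR.
cr=$(printf '\r')

# Strip a trailing CR from every line (CR+LF becomes LF). Sample input is made up.
result=$(printf 'one\r\ntwo\r\n' | sed "s/${cr}\$//")
printf '%s\n' "$result"
```

This handles the DOS-to-Unix direction. It does not help with CR-only files, since sed still needs linefeeds to find line ends in the first place.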




 Wed, 06 Feb 2002 03:00:00 GMT   
 