 extracting text between patterns

I am looking for a way to extract the text between two patterns.
The patterns could be on the same line, or they could be on different
lines.

e.g.
pattern1sometextpattern2

and I only want:
sometext

OR
pattern1sometext
something else
somemorepattern2
and I only want:

sometext
something else
somemore

If my patterns appear more than once (pattern1 followed by pattern2,
several times over), is there a way to specify which set to use?



 Sun, 24 Feb 2008 01:32:28 GMT   

One possible solution with sed (the semicolon-separated labels need GNU sed):
sed -n ':A;/pattern1.*pattern2/b B;N;b A;:B;s/pattern1\(.*\)pattern2/\1/gp'
Regards
--
pgancarz, at, o2, pl



 Sun, 24 Feb 2008 01:51:14 GMT   

I don't know how robust you need this to be wrt handling pathological
cases (e.g. pattern2 is part of pattern1), but this will work for the
samples you posted:

awk '/pattern1/,/pattern2/{gsub(/pattern1|pattern2/,"");print}'

To pick a particular set when the patterns occur more than once, just
define a counter, increment it each time pattern2 is found, and only
print (or sub()) when the counter reaches the occurrence you care about.
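
The counter idea could be sketched roughly as follows. This is only an
illustration built on the range one-liner above, not a tested general
solution: the function name extract_nth and the variables want, count
and closes are made up for the example, and it assumes at most one
pattern1..pattern2 range ends on any given line. Like the one-liner, it
prints whole lines with the two markers removed.

```shell
# Hypothetical helper: print the lines of the Nth pattern1..pattern2 range
# with both markers removed.  Usage: extract_nth N [file...]
extract_nth() {
    want=$1
    shift
    awk -v want="$want" '
        /pattern1/,/pattern2/ {
            line = $0
            closes = (line ~ /pattern2/)   # does this line end the range?
            if (count + 1 == want) {       # count = ranges already finished
                gsub(/pattern1|pattern2/, "", line)
                print line
            }
            if (closes) count++
        }
    ' "$@"
}
```

For example, "extract_nth 2 file" prints only the second range, and with
no file argument the function reads standard input.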

        Ed.



 Sun, 24 Feb 2008 01:50:03 GMT   

$ echo "pattern1sometextpattern2" | \
perl -l -00ne'@x = /(?<=pattern1)(.+?)(?=pattern2)/sg and print $x[0]'
sometext

$ echo "pattern1sometext
something else
somemorepattern2" | \
perl -l -00ne'@x = /(?<=pattern1)(.+?)(?=pattern2)/sg and print $x[0]'
sometext
something else
somemore

$ echo "pattern1sometextpattern2
something pattern1 else
somemorepattern2" | \
perl -l -00ne'@x = /(?<=pattern1)(.+?)(?=pattern2)/sg and print $x[1]'
 else
somemore

John
--
use Perl;
program
fulfillment



 Sun, 24 Feb 2008 04:22:11 GMT   

- If your text is non-XML/HTML, then you can replace 'pattern1' and
  'pattern2' with '<tag>' and '</tag>', and run it through an XML parser.

- If regex is applicable, then you can try something like
        '\<pattern1\>.*\<pattern2\>'
  But it will be slow unless you can be more specific than '.*'.

- Otherwise, you need to split on 'pattern1', then split on 'pattern2'.
        a=`< file`
        match -2 "$a" pattern1 b
        match -2 "${b[1]}" pattern2 c
        echo ${c[0]}
  will give you the first content.  Repeat.

I used to have this feature in my patch for Bash, but removed it because
the usage was too complicated and difficult to remember.  If this
problem is important enough, then I can patch it to Bash. :-)

--
William Park <opengeome...@yahoo.ca>, Toronto, Canada
ThinFlash: Linux thin-client on USB key (flash) drive
           http://home.eol.ca/~parkw/thinflash.html
BashDiff: Super Bash shell
          http://freshmeat.net/projects/bashdiff/



 Sun, 24 Feb 2008 04:57:51 GMT   

   No external command is necessary, if you are using bash or ksh93;
   you can load the entire file into a variable:

file=$( < "$FILENAME" ) ## This is assumed before all the following scripts.

   With other shells, an external command is necessary:

file=$( cat "$FILENAME" )

   To extract the first occurrence:

extract=${file#*pattern1}
extract=${extract%%pattern2*}

   To extract the last occurrence:

extract=${file##*pattern1}
extract=${extract%pattern2*}

   To extract the Nth occurrence, use this N-1 times then the first
   script:

extract=${file#*pattern2}
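
   The Nth-occurrence recipe could be wrapped up along these lines. It
   is a rough sketch only: the function name nth_between and the
   variables n, rest and i are invented for the example (and are not
   local, so they overwrite same-named variables in the caller). It
   also assumes every pattern2 closes a pattern1, since a stray
   pattern2 would throw the count off.

```shell
# Hypothetical helper: print the Nth (1-based) text found between
# pattern1 and pattern2 in a string.  Usage: nth_between N "$file"
nth_between() {
    n=$1
    rest=$2
    i=1
    # Strip everything up to and including pattern2, N-1 times.
    while [ "$i" -lt "$n" ]; do
        case $rest in
            *pattern2*) rest=${rest#*pattern2} ;;
            *) return 1 ;;                 # fewer than N occurrences
        esac
        i=$((i + 1))
    done
    # Make sure a complete pattern1..pattern2 pair is still left.
    case $rest in
        *pattern1*pattern2*) ;;
        *) return 1 ;;
    esac
    # Then apply the "first occurrence" script to what remains.
    rest=${rest#*pattern1}
    rest=${rest%%pattern2*}
    printf '%s\n' "$rest"
}
```

   Usage would be something like: extract=$( nth_between 2 "$file" )
   The function returns a non-zero status when there is no Nth pair.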

--
    Chris F.A. Johnson                     <http://cfaj.freeshell.org>
    ==================================================================
    Shell Scripting Recipes: A Problem-Solution Approach, 2005, Apress
    <http://www.torfree.net/~chris/books/cfaj/ssr.html>



 Sun, 24 Feb 2008 15:13:53 GMT   

Find the 2nd occurrence:

ruby -0777ne'$_=~/(?:.*?pattern1(.*?)pattern2){2}/m;puts $1' file



 Sun, 24 Feb 2008 18:57:18 GMT   
 