 Search for best matched portion of a string
Hi,

I have two strings (both may have a variable number of words).
The first string is: Unix Shell programming by Kernighan and Pike
The second string is: Unix Shell Programming by Kernighan

Would it be possible for a Unix bash shell script to tell that the
portion of the second string "Unix Shell Programming by Kernighan"
partially matches the first string?

Further, my second string can be:
Unix Shell programming by Pike

The same script (case insensitive) should tell that the portion of the
second string "Unix Shell programming by" partially matches the first
string.
Can anyone help me?
Thanks in advance,
Anil.



 Sat, 16 Jun 2007 21:06:32 GMT   
In article <1104239192.768504.235...@z14g2000cwz.googlegroups.com>,

# -q: quiet, -i: ignore case, -F: treat the second string as plain text
if printf '%s\n' "$first_string" | grep -qiF -- "$second_string"
then echo match
else echo no match
fi

I don't think there's anything standard that does this kind of fuzzy
matching.  You probably need to be more precise about what you're
looking for, since there would be a partial match even if the first
string were "Undercover brother" -- the portion "Un" in the second
string matches partially with this.

--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***



 Sun, 17 Jun 2007 13:32:06 GMT   
Thanks for the reply Barry.
I was looking for a shell script that will match whole words, not
portions of words, as I mentioned in my first mail.
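
To make the difference between a substring match and a whole-word match
concrete, here is a minimal sketch. The helper name has_word is made up
for illustration, and it assumes a grep that supports -w (GNU and BSD
grep do; POSIX does not require it).

```bash
#!/bin/bash
# has_word (hypothetical helper): succeed if $2 occurs in $1 as a whole word.
# Assumption: grep supports -w, which is common but not required by POSIX.
has_word() {
    # -q: quiet, -i: ignore case, -w: whole words only, -F: plain text
    printf '%s\n' "$1" | grep -qiwF -- "$2"
}

has_word "Unix Shell programming" "shell" && echo "word match"
has_word "Undercover brother" "Un" || echo "no word match"
```

With -w, "Un" no longer matches inside "Undercover", which is the kind
of false positive Barry pointed out.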


 Sun, 17 Jun 2007 14:46:08 GMT   
In comp.unix.shell Anil <ani...@gmail.com>:

Great, then write one, and while you're at it, please stop
multi-posting; cross-post if you think it's needed.

Please read: http://www.cs.tut.fi/~jkorpela/usenet/xpost.html

--
Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94)
mail: echo zvpu...@urvzvat.qr | perl -pe 'y/a-z/n-za-m/'
#bofh excuse 358: struck by the Good Times virus



 Sun, 17 Jun 2007 17:02:17 GMT   

We still need more information. If you are going to drop words, do you
always do it from the right hand end, or do you want to drop the smallest
number of words which makes a match?

What characters can make up a 'word'? Is it any non-space character, or is
it restricted to alpha-numeric characters or just alphabetic characters?
Can we assume that we are dealing with ASCII, or do we need to worry about
Unicode?

The following may be what you want.

#!/bin/bash
string1="Unix Shell programming by Kernighan and Pike"
string2="Unix Shell programming by Pike"

# Lower-case the strings, since the match must be case insensitive
string1="$(printf '%s' "$string1" | tr '[:upper:]' '[:lower:]')"
string2="$(printf '%s' "$string2" | tr '[:upper:]' '[:lower:]')"
# Put spaces around the second string so we can match on word boundaries
string2=" $string2 "

# Loop, cutting words off the end of the second string, until nothing is left
while [ "X X" != "X${string2}X" ]
do
        # See if the first string contains the second string
        case " $string1 " in
        *"$string2"*)
                # Yes: print the second string and stop
                echo "$string2"
                break
                ;;
        esac
        # Replace the trailing 'space word space' with a single 'space'
        string2="${string2% * } "
done
done



 Mon, 18 Jun 2007 10:37:27 GMT   

The strings that I mentioned are just examples; words may be dropped
at the start or at the end.

The words are ASCII, separated by white space only, and are built of
alphanumeric characters.
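
One possible reading of "words may be dropped at the start or the end"
is: find the longest run of consecutive words that occurs in both
strings. The following is only a sketch of that reading for bash, not a
settled answer to the question. The function name longest_common_run is
made up, the comparison is assumed to be case insensitive, and on a tie
the first run found wins.

```bash
#!/bin/bash
# longest_common_run (hypothetical helper): print the longest run of
# consecutive words found in both $1 and $2, comparing case-insensitively.
# Assumption: words are separated by white space and each string is one line.
longest_common_run() {
    local -a a b orig
    local i j k best_len=0 best_start=0
    # Lower-cased copies are used for comparing, the original for printing
    read -r -a a <<< "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    read -r -a b <<< "$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
    read -r -a orig <<< "$2"
    for (( i = 0; i < ${#a[@]}; i++ )); do
        for (( j = 0; j < ${#b[@]}; j++ )); do
            k=0
            # Extend the run while the words at both positions agree
            while (( i + k < ${#a[@]} && j + k < ${#b[@]} )) &&
                  [[ ${a[i+k]} == "${b[j+k]}" ]]; do
                k=$(( k + 1 ))
            done
            # Remember the longest run and where it starts in the second string
            if (( k > best_len )); then
                best_len=$k
                best_start=$j
            fi
        done
    done
    # Print the run as spelled in the second string (empty line if none)
    printf '%s\n' "${orig[*]:best_start:best_len}"
}

longest_common_run "Unix Shell programming by Kernighan and Pike" \
                   "Shell programming by Kernighan"
```

This compares every pair of starting positions, which is fine for short
titles but would be slow on long texts.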



 Mon, 18 Jun 2007 14:15:48 GMT   

    And what are the criteria for matching?

    Please restate exactly what you want to do. Examples are not
    enough.

--
    Chris F.A. Johnson                  http://cfaj.freeshell.org/shell
    ===================================================================
    My code (if any) in this post is copyright 2004, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License



 Mon, 18 Jun 2007 14:45:47 GMT   

Then neither of the solutions posted to comp.lang.awk will work since
they just account for words being dropped from the end.

And please don't continue these separate threads in multiple NGs. I'm
cross-posting this response to comp.lang.awk so Anil can bring us all up
to speed on what he really wants and decide which NG(s) to continue this in.

        Ed.



 Mon, 18 Jun 2007 21:53:21 GMT   

A ksh93 solution, may work in bash:

typeset -l WORD
typeset -l STRING="Unix Shell programming by Kernighan and Pike"
MATCH=( Unix Shell Programming by Kernighan )
GOTIT=1
for WORD in "${MATCH[@]}"
do
[[ "${STRING}" = "${WORD} "* ||
"${STRING}" = *" ${WORD} "* ||
"${STRING}" = *" ${WORD}" ]] && GOTIT=0
done

(( GOTIT == 0 )) && print got partial match
(( GOTIT == 1 )) && print no match

--
Dana French



 Tue, 19 Jun 2007 00:17:52 GMT   

I haven't tried it, but that looks like it'd report that a MATCH string
of "egg shell" partially matches the STRING "Unix Shell programming by
Kernighan and Pike", which I don't believe is what the OP wanted.

        Ed.



 Tue, 19 Jun 2007 00:24:26 GMT   

I thought of a better ksh93 solution (maybe bash):

STRING="Unix Shell programming by Kernighan and Pike"
MATCH="Unix Shell Programming by Kernighan"
MATCH="${MATCH//+([ ])/|}"
[[ "${STRING}" = *@(${MATCH})* ]] && print got partial match
--
Dana French



 Tue, 19 Jun 2007 00:28:18 GMT   

Yes it would match, but it is a partial match which is what I thought
the OP wanted.

--
Dana French



 Tue, 19 Jun 2007 00:29:50 GMT   
Chris,

Let me write down the requirements:
- Inputs are 2 strings, each with a variable number of words
- the string search should be case sensitive
- the given strings contain alphanumeric chars (plain text) and no
special chars
- if the input is
string1="Unix Shell programming by Kernighan and Pike"
string2="Unix Shell programming by Pike"

the output should be "Unix Shell programming by"

- if the input is
string1="Unix Shell programming by Kernighan and Pike"
string2="Unix Shell by Pike"

the output should be
"Unix Shell by"

- if the input is
string1="Unix Shell programming by Kernighan and Pike"
string2="Shell programming by Kernighan"

the output should be
"Shell programming by Kernighan"

Hope I am clear now



 Tue, 19 Jun 2007 14:11:30 GMT   

    For that example, the rule would read: "Return the longest
    consecutive portion of the strings that is common to both."

     What is the rule for this?

     Why would the result not be, "Unix Shell by Pike"?

     Or, to match the first example, "Unix Shell"?

     Or, to match the following example, "Unix Shell Pike" (rule:
     "Return matching portions from beginning and end of both
     strings")?

     Or, also matching the following example, "Unix Shell by Pike",
     using the rule I give below.

     The rule for this last example, would be: "Return all words
     common to both strings".

     No, because your examples are inconsistent.

     What rule fits all three examples?

--
    Chris F.A. Johnson                  http://cfaj.freeshell.org/shell
    ===================================================================
    My code (if any) in this post is copyright 2004, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License



 Tue, 19 Jun 2007 16:04:31 GMT   
Sorry for the confusion.
After seeing Chris's response, I would like to stick to the rule he
gave for the last example: "Return all words common to both strings".

Please ignore the other examples.
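
For the rule "Return all words common to both strings", a minimal bash
sketch could look like the following. The function name common_words is
made up. It assumes that words are separated by white space, that the
comparison ignores case (as in the first post; drop the tr calls for a
case-sensitive match), and that the output keeps the order and spelling
of the second string.

```bash
#!/bin/bash
# common_words (hypothetical helper): print the words of $2 that also
# occur as whole words in $1.
# Assumptions: white-space separated words, case-insensitive comparison,
# output in the order and spelling of the second string.
common_words() {
    local lower1 word lw out=""
    local -a words
    # Lower-case the first string, squeeze white space, pad with spaces
    lower1=" $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' ' ') "
    # Split the second string into words
    read -r -a words <<< "$2"
    for word in "${words[@]}"; do
        lw=$(printf '%s' "$word" | tr '[:upper:]' '[:lower:]')
        # The surrounding spaces make sure only whole words match
        case "$lower1" in
            *" $lw "*) out="${out:+$out }$word" ;;
        esac
    done
    printf '%s\n' "$out"
}

common_words "Unix Shell programming by Kernighan and Pike" "Unix Shell by Pike"
```

For the example given earlier (second string "Unix Shell by Pike"), this
prints "Unix Shell by Pike". A word that appears twice in the second
string is printed twice; whether that is wanted was not specified.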


