Author |
Message |
Anil #1 / 17
|
 Search for best matched portion of a string
Hi,
I have 2 strings (both may have a variable number of words).
First string: Unix Shell programming by Kernighan and Pike
Second string: Unix Shell Programming by Kernighan
Would it be possible, through a Unix bash shell script, to tell that the portion of the second string "Unix Shell Programming by Kernighan" partially matches the first string?
Further, my second string can be: Unix Shell programming by Pike
The same script (case insensitive) should tell that the portion of the second string "Unix Shell programming by" partially matches the first string.
Can anyone help me?
Thanks in advance, Anil.
|
Sat, 16 Jun 2007 21:06:32 GMT |
|
 |
Barry Margoli #2 / 17
|
 Search for best matched portion of a string
In article <1104239192.768504.235...@z14g2000cwz.googlegroups.com>,
    if echo "$first_string" | grep "$second_string" >/dev/null
    then
        echo match
    else
        echo no match
    fi
I don't think there's anything standard that does this kind of fuzzy matching. You probably need to be more precise about what you're looking for, since there would be a partial match even if the first string were "Undercover brother" -- the portion "Un" in the second string matches partially with this. -- Barry Margolin, bar...@alum.mit.edu Arlington, MA *** PLEASE post questions in newsgroups, not directly to me ***
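As a side note to the grep test above: it treats the second string as a regular expression and is case sensitive. A minimal sketch of a variation, assuming the standard grep options -q (quiet), -i (ignore case) and -F (fixed string); the function name contains_ci is hypothetical. It still only finds the whole second string inside the first, so it does not do any partial or fuzzy matching.

```shell
#!/bin/bash
# contains_ci HAYSTACK NEEDLE  (hypothetical helper name)
# Succeeds if NEEDLE occurs in HAYSTACK as a literal substring, ignoring case.
contains_ci() {
    printf '%s\n' "$1" | grep -qiF -- "$2"
}

first_string="Unix Shell programming by Kernighan and Pike"
second_string="Unix Shell Programming by Kernighan"

if contains_ci "$first_string" "$second_string"
then
    echo match
else
    echo no match
fi
```

With the two example strings this reports a match despite the "programming"/"Programming" difference, while "Unix Shell programming by Pike" is still rejected because it is not a contiguous substring of the first string.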
|
Sun, 17 Jun 2007 13:32:06 GMT |
|
 |
Anil #3 / 17
|
 Search for best matched portion of a string
Thanks for the reply, Barry. I was looking for a shell script that will match words, not portions of words, as I mentioned in my first mail.
|
Sun, 17 Jun 2007 14:46:08 GMT |
|
 |
Michael Heimin #4 / 17
|
 Search for best matched portion of a string
In comp.unix.shell Anil <ani...@gmail.com>:
Great, then write one, and while you're at it, please stop multi-posting; cross-post if you think it's needed. Please read: http://www.cs.tut.fi/~jkorpela/usenet/xpost.html -- Michael Heiming (X-PGP-Sig > GPG-Key ID: EDD27B94) mail: echo zvpu...@urvzvat.qr | perl -pe 'y/a-z/n-za-m/' #bofh excuse 358: struck by the Good Times virus
|
Sun, 17 Jun 2007 17:02:17 GMT |
|
 |
Icarus Sparr #5 / 17
|
 Search for best matched portion of a string
We still need more information. If you are going to drop words, do you always do it from the right-hand end, or do you want to drop the smallest number of words which makes a match? What characters can make up a 'word'? Is it any non-space character, or is it restricted to alphanumeric characters or just alphabetic characters? Can we assume that we are dealing with ASCII, or do we need to worry about Unicode?
The following may be what you want.
    #!/bin/bash
    string1="Unix Shell programming by Kernighan and Pike"
    string2="Unix Shell programming by Pike"
    # Lower case the strings, as it is required to be case insensitive
    string1="$(echo $string1 | tr 'A-Z' 'a-z')"
    string2="$(echo $string2 | tr 'A-Z' 'a-z')"
    # put spaces around the second string so we can match word boundaries
    string2=" $string2 "
    # Loop, cutting words off second string, until there is nothing left
    while [ "X X" != "X${string2}X" ]
    do
        # See if the first string contains the second string
        case " $string1 " in
        *"$string2"*)
            # Yes, print out the second string
            echo "$string2"
            break ;;
        esac
        # replace 'space word space' by 'space'
        string2="${string2% * } "
    done
|
Mon, 18 Jun 2007 10:37:27 GMT |
|
 |
Anil #6 / 17
|
 Search for best matched portion of a string
The strings that I mentioned are just examples; words may be dropped at the start or the end.
The words are ASCII, separated by white space (only), and are built of alphanumeric characters.
|
Mon, 18 Jun 2007 14:15:48 GMT |
|
 |
Chris F.A. Johnso #7 / 17
|
 Search for best matched portion of a string
And what are the criteria for matching? Please restate exactly what you want to do. Examples are not enough. -- Chris F.A. Johnson http://cfaj.freeshell.org/shell =================================================================== My code (if any) in this post is copyright 2004, Chris F.A. Johnson and may be copied under the terms of the GNU General Public License
|
Mon, 18 Jun 2007 14:45:47 GMT |
|
 |
Ed Morto #8 / 17
|
 Search for best matched portion of a string
Then neither of the solutions posted to comp.lang.awk will work since they just account for words being dropped from the end.
And please don't continue these separate threads in multiple NGs. I'm cross-posting this response to comp.lang.awk so Anil can bring us all up to speed on what he really wants and decide which NG(s) to continue this in. Ed.
|
Mon, 18 Jun 2007 21:53:21 GMT |
|
 |
dfre.. #9 / 17
|
 Search for best matched portion of a string
A ksh93 solution, may work in bash:
    typeset -l WORD
    typeset -l STRING="Unix Shell programming by Kernighan and Pike"
    MATCH=( Unix Shell Programming by Kernighan )
    GOTIT=1
    for WORD in "${MATCH[@]}"
    do
        [[ "${STRING}" = "${WORD} "* || "${STRING}" = *" ${WORD} "* || "${STRING}" = *" ${WORD}" ]] && GOTIT=0
    done
    (( GOTIT == 0 )) && print got partial match
    (( GOTIT == 1 )) && print no match
-- Dana French
|
Tue, 19 Jun 2007 00:17:52 GMT |
|
 |
Ed Morto #10 / 17
|
 Search for best matched portion of a string
I haven't tried it, but that looks like it'd report that a MATCH string of "egg shell" partially matches the STRING "Unix Shell programming by Kernighan and Pike", which I don't believe is what the OP wanted. Ed.
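Ed's reading can be checked with a rough bash port of the word loop from the ksh93 post. This is a sketch, not Dana's code: the function name any_word_matches is hypothetical, tr stands in for typeset -l, and echo replaces print, which bash does not have.

```shell
#!/bin/bash
# any_word_matches STRING MATCH  (hypothetical helper name)
# Succeeds as soon as any single word of MATCH appears as a whole word
# in STRING, ignoring case. This mirrors the "any word" logic of the loop.
any_word_matches() {
    local string word
    local -a words
    string=$(printf '%s\n' "$1" | tr 'A-Z' 'a-z')
    read -r -a words <<< "$(printf '%s\n' "$2" | tr 'A-Z' 'a-z')"
    for word in "${words[@]}"; do
        # Pad with spaces so only whole words match
        case " $string " in
            *" $word "*) return 0 ;;
        esac
    done
    return 1
}

if any_word_matches "Unix Shell programming by Kernighan and Pike" "egg shell"
then
    echo "got partial match"
else
    echo "no match"
fi
```

Run as written, this prints "got partial match", because "shell" alone is enough to satisfy the test.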
|
Tue, 19 Jun 2007 00:24:26 GMT |
|
 |
dfre.. #11 / 17
|
 Search for best matched portion of a string
I thought of a better ksh93 solution (maybe bash):
    STRING="Unix Shell programming by Kernighan and Pike"
    MATCH="Unix Shell Programming by Kernighan"
    MATCH="${MATCH//+([ ])/|}"
    [[ "${STRING}" = *@(${MATCH})* ]] && print got partial match
-- Dana French
|
Tue, 19 Jun 2007 00:28:18 GMT |
|
 |
dfre.. #12 / 17
|
 Search for best matched portion of a string
Yes it would match, but it is a partial match which is what I thought the OP wanted. -- Dana French
|
Tue, 19 Jun 2007 00:29:50 GMT |
|
 |
Anil #13 / 17
|
 Search for best matched portion of a string
Chris, let me write down the requirements:
- Inputs are 2 strings that each have a variable number of words.
- The string search should be case sensitive.
- The given strings contain alphanumeric characters (plain text) and no special characters.
- If the input is
    string1="Unix Shell programming by Kernighan and Pike"
    string2="Unix Shell programming by Pike"
  the output should be "Unix Shell programming by".
- If the input is
    string1="Unix Shell programming by Kernighan and Pike"
    string2="Unix Shell by Pike"
  the output should be "Unix Shell by".
- If the input is
    string1="Unix Shell programming by Kernighan and Pike"
    string2="Shell programming by Kernighan"
  the output should be "Shell programming by Kernighan".
Hope I am clear now.
|
Tue, 19 Jun 2007 14:11:30 GMT |
|
 |
Chris F.A. Johnso #14 / 17
|
 Search for best matched portion of a string
For that example, the rule would read: "Return the longest consecutive portion of the strings that is common to both."
What is the rule for this? Why would the result not be, "Unix Shell by Pike"? Or, to match the first example, "Unix Shell"? Or, to match the following example, "Unix Shell Pike" (rule: "Return matching portions from beginning and end of both strings")? Or, also matching the following example, "Unix Shell by Pike", using the rule I give below.
The rule for this last example, would be: "Return all words common to both strings".
No, because your examples are inconsistent. What rule fits all three examples? -- Chris F.A. Johnson http://cfaj.freeshell.org/shell =================================================================== My code (if any) in this post is copyright 2004, Chris F.A. Johnson and may be copied under the terms of the GNU General Public License
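For reference, the first candidate rule in this post ("Return the longest consecutive portion of the strings that is common to both") could be sketched in bash roughly as follows. This is only an illustration under assumptions: words are separated by white space, the comparison is case sensitive as in the stated requirements, ties go to the first run found, and the function name longest_common_run is hypothetical.

```shell
#!/bin/bash
# longest_common_run STRING1 STRING2  (hypothetical helper name)
# Prints the longest run of consecutive words that occurs in both strings.
longest_common_run() {
    local -a a=() b=()
    local i j k best_len=0 best_start=0
    read -r -a a <<< "$1"
    read -r -a b <<< "$2"
    # Try every pair of start positions and extend while the words agree
    for (( i = 0; i < ${#a[@]}; i++ )); do
        for (( j = 0; j < ${#b[@]}; j++ )); do
            k=0
            while (( i + k < ${#a[@]} && j + k < ${#b[@]} )) &&
                  [[ ${a[i+k]} == "${b[j+k]}" ]]; do
                k=$(( k + 1 ))
            done
            if (( k > best_len )); then
                best_len=$k
                best_start=$j
            fi
        done
    done
    # Print the winning slice of the second string (empty if nothing matched)
    echo "${b[*]:best_start:best_len}"
}

longest_common_run "Unix Shell programming by Kernighan and Pike" \
                   "Unix Shell programming by Pike"
```

For the first and third examples this gives "Unix Shell programming by" and "Shell programming by Kernighan"; for the second example it gives "Unix Shell", not the requested "Unix Shell by", which is the inconsistency pointed out above.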
|
Tue, 19 Jun 2007 16:04:31 GMT |
|
 |
Anil #15 / 17
|
 Search for best matched portion of a string
Sorry for the confusion. After seeing Chris's response, I would like to stick to the rule for the last example: "Return all words common to both strings". Please ignore the other examples.
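A minimal bash sketch of the rule "Return all words common to both strings", under a few assumptions not settled in the thread: words are split on white space, the comparison is case sensitive, the output keeps the word order of the second string, and a word repeated in the second string is printed each time. The function name common_words is hypothetical.

```shell
#!/bin/bash
# common_words STRING1 STRING2  (hypothetical helper name)
# Prints every word of STRING2 that also occurs as a whole word in STRING1.
common_words() {
    local -a first=() second=() out=()
    local w v
    read -r -a first <<< "$1"
    read -r -a second <<< "$2"
    for w in "${second[@]}"; do
        for v in "${first[@]}"; do
            if [[ $w == "$v" ]]; then
                out+=("$w")
                break
            fi
        done
    done
    echo "${out[*]}"
}

common_words "Unix Shell programming by Kernighan and Pike" "Unix Shell by Pike"
# prints: Unix Shell by Pike
```

For a case-insensitive comparison, both arguments could be passed through tr 'A-Z' 'a-z' first, as in the earlier script in this thread.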
|
Fri, 22 Jun 2007 17:27:51 GMT |
|
 |
|