On Sat, 3 Nov 2001, Subramanian Radhakrishnan wrote:
> How to implement stop words in Namazu search. what are all the files
> are need to be modified for this purpose...
Before you try my way, I'd really suggest that you pre-process the query
string with something like Perl before sending it to Namazu. If that
doesn't work for you, then try this:
Instructions follow:
These are for namazu-2.0.5; I have not tested with later versions. It
is also not the best way to do it. I can think of better ways, but
haven't tried them yet. I will try to get this to work similarly to the
rest of Namazu, but for now, it works for me.
I also have an implementation of synonyms along the same lines.
I have attached two files, stop-list.c and stop-list.h (sent as
stop-list.tar.gz).
Additionally, you will need to create a text file called stopwords.txt
with one word per line. This file must be in the same directory as your
index.
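To make the one-word-per-line format concrete, here is a minimal sketch of a loader for such a file. It is not the code from the attached stop-list.c; the names struct stop_list, stop_list_read(), stop_list_contains() and stop_list_clear() are hypothetical and only illustrate the idea. It assumes lines shorter than 255 characters and does an exact, case-sensitive match.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stop list: a growable array of strings. */
struct stop_list {
    char **words;
    size_t count;
    size_t cap;
};

/* Free every stored word and reset the list to empty. */
void stop_list_clear(struct stop_list *sl)
{
    size_t i;
    for (i = 0; i < sl->count; i++)
        free(sl->words[i]);
    free(sl->words);
    sl->words = NULL;
    sl->count = sl->cap = 0;
}

/* Read one word per line from fp. Blank lines are skipped.
 * Lines longer than the buffer are split (a limitation of this sketch).
 * Returns 0 on success, -1 if memory runs out. */
int stop_list_read(struct stop_list *sl, FILE *fp)
{
    char line[256];

    assert(sl != NULL && fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        char *copy;

        line[strcspn(line, "\r\n")] = '\0';   /* strip the line ending */
        if (line[0] == '\0')
            continue;
        if (sl->count == sl->cap) {
            size_t ncap = sl->cap ? sl->cap * 2 : 16;
            char **tmp = realloc(sl->words, ncap * sizeof *tmp);
            if (tmp == NULL)
                return -1;
            sl->words = tmp;
            sl->cap = ncap;
        }
        copy = malloc(strlen(line) + 1);
        if (copy == NULL)
            return -1;
        strcpy(copy, line);
        sl->words[sl->count++] = copy;
    }
    return 0;
}

/* Return 1 if word is in the list, 0 otherwise (NULL is never a match). */
int stop_list_contains(const struct stop_list *sl, const char *word)
{
    size_t i;

    if (word == NULL)
        return 0;
    for (i = 0; i < sl->count; i++)
        if (strcmp(sl->words[i], word) == 0)
            return 1;
    return 0;
}
```

A caller would fopen() stopwords.txt from the index directory, pass the handle to stop_list_read(), and call stop_list_clear() when the query has been processed. A linear scan is fine for a list of a few hundred words; a sorted array with bsearch() would be the obvious next step.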
You have to put stop-list.c and stop-list.h in the nmz/ directory, and
make the following changes to nmz/query.c:
At the top of the file, add:

    #include "stop-list.h"

In nmz_make_query(), after:

    /* If too much items in query, return with error */
    if (tokennum > QUERY_TOKEN_MAX) {
        return ERR_TOO_MANY_TOKENS;
    }

add:

    /* Read stop list from file */
    read_stop_list();

After:

    if (query.str[i] != '\0')
        query.str[i++] = '\0';

add:

    /* If the word is in the stop list, then purge it */
    if (is_stop_word(query.tab[tokennum])) {
        query.tab[tokennum] = (char *) NULL;
    }

After the end of the for loop, add:

    /* Clear stop list */
    clear_word_list();
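For comparison, the pre-processing route suggested at the start (filtering the query string before it ever reaches Namazu) could look roughly like the sketch below. It is written in C rather than Perl only to match the rest of this mail. The function strip_stop_words() and the built-in word list are hypothetical, and the code splits on whitespace only: it knows nothing about query operators or phrase syntax, so operator words should stay out of the list.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stop list; a real one would come from a file. */
static const char *const STOP_WORDS[] = { "a", "an", "of", "the" };
#define STOP_WORD_COUNT (sizeof STOP_WORDS / sizeof STOP_WORDS[0])

/* Case-insensitive comparison of a token of length len against word. */
static int token_equals(const char *tok, size_t len, const char *word)
{
    size_t i;

    if (strlen(word) != len)
        return 0;
    for (i = 0; i < len; i++) {
        if (tolower((unsigned char)tok[i]) != tolower((unsigned char)word[i]))
            return 0;
    }
    return 1;
}

/* Return 1 if the token matches any entry in STOP_WORDS. */
static int is_stop_token(const char *tok, size_t len)
{
    size_t i;

    for (i = 0; i < STOP_WORD_COUNT; i++) {
        if (token_equals(tok, len, STOP_WORDS[i]))
            return 1;
    }
    return 0;
}

/* Copy query into out (capacity outsize), dropping stop words and
 * collapsing runs of whitespace to single spaces. Returns out. */
char *strip_stop_words(const char *query, char *out, size_t outsize)
{
    size_t used = 0;

    assert(query != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    while (*query != '\0') {
        size_t len = 0;

        while (*query != '\0' && isspace((unsigned char)*query))
            query++;                       /* skip leading whitespace */
        while (query[len] != '\0' && !isspace((unsigned char)query[len]))
            len++;                         /* measure the next token */
        if (len > 0 && !is_stop_token(query, len)) {
            size_t need = len + (used > 0 ? 1 : 0);

            if (used + need >= outsize)
                break;                     /* no room: stop at a token boundary */
            if (used > 0)
                out[used++] = ' ';
            memcpy(out + used, query, len);
            used += len;
            out[used] = '\0';
        }
        query += len;
    }
    return out;
}
```

With this list, the query "the history of Namazu" is reduced to "history Namazu" before being handed to the search. The advantage over patching nmz/query.c is that nothing in Namazu itself has to change, so an upgrade past 2.0.5 does not need the patch redone.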
--
The program isn't debugged until the last user is dead.
Visit my webpage at http://www.ncst.ernet.in/~philip/
Read my writings at http://www.ncst.ernet.in/~philip/writings/
MSN philiptellis Yahoo! philiptellis
AIM philiptellis ICQ 129711328
Attachment:
stop-list.tar.gz
Description: GNU Zip compressed data