Namazu-users-en(old)



Re: Symbol matching



In article <Pine.LNX.4.21.0105041552380.31987-100000@xxxxxxxxxxxxxxxxxxxxx>
philip@xxxxxxxxxxxxxxxxxxxx writes:

>> However, looking through the source, I see no evidence of this
>> happening.  Could anyone provide pointers on where this would be done?

This is handled by the wordcount_sub() function in mknmz:

> sub wordcount_sub ($$\%) {
>     my ($text, $weight, $word_count) = @_;
> 
>     # Count frequencies of words in a current document.
>     # Handle symbols as follows.
>     #
>     # tcp/ip      ->  tcp/ip,     tcp,      ip
>     # (tcp/ip)    ->  (tcp/ip),   tcp/ip,   tcp, ip
>     # ((tcp/ip))  ->  ((tcp/ip)), (tcp/ip), tcp
>     #
>     # Don't do processing for nested symbols.
>     # NOTE: When -K is specified, all symbols are already removed.
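
To illustrate the first two rows of that comment, here is a minimal sketch. It is not the code from mknmz: expand_symbols() is a hypothetical helper, and it assumes that "symbol" means any character other than an ASCII letter or digit. It does not try to reproduce the nested case in the third row.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mknmz): expand one token into the
# terms listed in the first two rows of the comment above.
# Assumption: a "symbol" is any character outside A-Z, a-z, 0-9.
sub expand_symbols {
    my ($token) = @_;
    my @terms = ($token);

    # Strip at most one leading and one trailing symbol character,
    # e.g. "(tcp/ip)" -> "tcp/ip".
    my $inner = $token;
    $inner =~ s/^[^A-Za-z0-9]//;
    $inner =~ s/[^A-Za-z0-9]$//;
    push @terms, $inner if $inner ne $token && length($inner);

    # Split what is left on runs of symbols to get the bare words,
    # e.g. "tcp/ip" -> "tcp", "ip".
    my @parts = grep { length } split /[^A-Za-z0-9]+/, $inner;
    push @terms, @parts if @parts > 1;

    return @terms;
}

print join(', ', expand_symbols($_)), "\n" for 'tcp/ip', '(tcp/ip)';
# prints:
# tcp/ip, tcp, ip
# (tcp/ip), tcp/ip, tcp, ip
```

In wordcount_sub() each such term would then be counted with the given weight; the sketch only shows the expansion step.
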
-- 
NOKUBI Takatsugu
E-mail: knok@xxxxxxxxxxxxx
	knok@xxxxxxxxxx / knok@xxxxxxxxxx