Namazu-devel-en(old)
[no subject]
- From: Roel Brand <qrbr@xxxxxx>
- Date: Mon, 19 Mar 2001 11:07:40 +0100 (MET)
- X-ml-name: namazu-devel-en
- X-mail-count: 00021
Subject: Trying to hook into Namazu
X-Mailer: Cronos II 0.2.1 (gnome-libs 1.2.1; Linux 2.2.13; i686)
X-CronosII-Account: qrbr
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 8bit
CC: klok@xxxxxx, mbru@xxxxxx
Hello all,
Let me first introduce myself: I am a graduate student in computer science at the
Eindhoven University of Technology, the Netherlands. I am working on a document
retrieval research project, and for my research I need a search engine that can
report the following about its results:
- some kind of document ID
- document frequency for each search term (i.e. the number of documents in which term A occurs)
- term frequency for each search term in each found document (i.e. how many times term A occurs in document i)
- the length (preferably in characters) of each found document
- size of the searched document collection
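To make the list above concrete, here is a minimal sketch of the statistics
meant, computed over a toy in-memory collection. It is written in Python purely
for illustration and has nothing to do with Namazu's internals: the function
name collection_stats, the whitespace tokenization, and the shape of the
returned dictionary are all assumptions.

```python
from collections import Counter

def collection_stats(docs, query_terms):
    """docs maps a document ID to its text; query_terms are lowercase words."""
    # Per-document term counts, using naive whitespace tokenization (an assumption).
    counts = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    # Document frequency: in how many documents does each query term occur?
    df = {t: sum(1 for c in counts.values() if c[t] > 0) for t in query_terms}
    # Only documents matching at least one query term count as "found".
    hits = {}
    for doc_id, c in counts.items():
        tf = {t: c[t] for t in query_terms}  # raw term frequency per query term
        if any(tf.values()):
            hits[doc_id] = {"tf": tf, "length": len(docs[doc_id])}  # length in characters
    return {"collection_size": len(docs), "df": df, "hits": hits}

docs = {
    1: "namazu is a full text search engine",
    2: "a search engine indexes text and more text",
    3: "unrelated document",
}
stats = collection_stats(docs, ["text", "engine"])
```

Here stats["hits"] is keyed by document ID, and each entry carries the raw term
frequencies and the character length of that document.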
I have been digging through the Namazu source code, looking for a place where I could
add a hook to include these things in the search results, but I found that it will cost me a lot
of time just to find out what some of the functions do without having an idea of the overall architecture.
I think I could change nmz_get_hlist() to add raw term frequencies for all search terms
to the nmz_data struct it is building. But I'm not sure because I don't understand what
nmz_read_unpackw() and nmz_get_unpackw() do.
Could someone please give me an idea of how feasible this is?
Sincerely,
Roel Brand