Edit index structure

Matthias W.

2008-09-11


Hi,
Is it possible to edit the index structure of Nutch?

I have the following problem:
The files will be indexed by Nutch, and the frontend will be implemented with
Zend Framework 1.6.0 (Zend_Search_Lucene).
As far as I can tell, Zend_Search_Lucene doesn't support the Nutch index
structure, so I can only read the title, url, digest-code, tstamp, and score
from the Nutch index, but I'm not able to read the digest itself or other fields.
Can I change which fields are stored in the index, and if so, where?
Or are there other ways to solve this problem?

I also have a question about Nutch (version 0.9):
Does Nutch check the MIME type of a file before indexing, or does it only
check the file extension to choose the matching parser?
--
Sent from the Nutch - User mailing list archive at Nabble.com.
