index-more and contentLength field

Hilkiah Lavinier

2008-08-06

Hi Guys,

Can anyone explain how the contentLength field is populated? I've been indexing a few sites, and some have this field available while others don't. I don't understand why. I've looked through the MoreIndexingFilter.java, ParseData.java, HttpHeaders.java and Metadata.java source files, as well as the logs of the various crawl steps (fetch through index), but I can't figure it out.

I'm using Nutch trunk with the index-more and query-more plugins enabled.
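
One possibility worth ruling out, as a minimal sketch only: it assumes (not confirmed from the Nutch source here) that the field is copied from the HTTP Content-Length response header, so a response that omits the header, for example one sent with chunked transfer encoding, would leave the field unset. The class and method names below are hypothetical and are not part of Nutch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class ContentLengthCheck {

    // Hypothetical helper (not a Nutch API): returns the Content-Length
    // value if the response headers contain one, otherwise an empty Optional.
    public static Optional<String> contentLength(Map<String, String> headers) {
        // HTTP header names are case-insensitive, so look them up that way.
        Map<String, String> ci = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        ci.putAll(headers);
        String value = ci.get("Content-Length");
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        // A response that declares its length up front.
        Map<String, String> fixed = new HashMap<>();
        fixed.put("content-length", "1024");
        fixed.put("Content-Type", "text/html");

        // A chunked response, which normally carries no Content-Length.
        Map<String, String> chunked = new HashMap<>();
        chunked.put("Transfer-Encoding", "chunked");

        System.out.println(contentLength(fixed).orElse("(absent)"));   // prints 1024
        System.out.println(contentLength(chunked).orElse("(absent)")); // prints (absent)
    }
}
```

If that assumption holds, comparing the raw response headers of a site that has the field with one that lacks it should show the difference.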

Regards,

Hilkiah G. Lavinier MEng (Hons), ACGI
6 Winston Lane,
Goodwill,
Roseau, Dominica


Mbl: (767) 275 3382
Fax: (767) 440 4991
VoIP (646) 432 4487


Email: hilkiah@(protected)
Email: hilkiah.lavinier@(protected)
IM: Yahoo hilkiah / MSN hilkiahlavinier@(protected)
IM: ICQ #8978201 / AOL hilkiah21



   