Java Mailing List Archive

http://www.java2.5341.com/

nutch-user.lucene

Extracting Content-Length

Kevin MacDonald

2008-09-15


When crawling about 3500 URLs to a depth of 1, I find that, after
dumping the segment data, the Content-Length field appears in the Content
Metadata for only about half of the crawled URLs. Does Nutch calculate
the content length itself, or does it simply display the header value when
that header is present?
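
To make the two possibilities concrete, here is a minimal sketch in plain Java. It is not Nutch code and says nothing about what Nutch actually does: the class name, the method names, and the use of a plain Map for the headers are all assumptions made for illustration. It reports the Content-Length header when one is present and parseable, and otherwise falls back to counting the bytes of the fetched body.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical illustration only; none of these names come from Nutch.
public class ContentLengthSketch {

    // Returns the declared Content-Length, if the header exists and parses as a number.
    static OptionalLong declaredLength(Map<String, String> headers) {
        for (Map.Entry<String, String> e : headers.entrySet()) {
            // HTTP header names are case-insensitive, so compare accordingly.
            if ("Content-Length".equalsIgnoreCase(e.getKey()) && e.getValue() != null) {
                try {
                    return OptionalLong.of(Long.parseLong(e.getValue().trim()));
                } catch (NumberFormatException ignored) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    // Uses the header value when available, otherwise the actual size of the body.
    static long effectiveLength(Map<String, String> headers, byte[] body) {
        return declaredLength(headers).orElse(body.length);
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);

        Map<String, String> withHeader = new HashMap<>();
        withHeader.put("Content-Length", "1024");

        Map<String, String> withoutHeader = new HashMap<>();

        System.out.println(effectiveLength(withHeader, body));
        System.out.println(effectiveLength(withoutHeader, body));
    }
}
```

If the dumped metadata only ever echoes the header, a fallback of this kind would have to be applied to the stored content bytes after the fetch.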

Kevin