Java Mailing List Archive

nutch-user.lucene

Retrieving data for a particular URL from crawldb?

Viksit Gaur

2008-06-12

Hi all,

Is there a way to retrieve a particular page from the nutch crawl using
the URL as a key? Since I don't know which segment directory this page
was put into, I can't use nutch readseg. I can look the URL up in the
crawldb with nutch readdb, but that tool only gives stats about the URL
and not its contents.

Any ideas on the best way to do this?

Thanks,
Viksit
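
One possible workaround, sketched here under assumptions and not as a confirmed answer from the list: when the segment is unknown, try every segment directory in turn with "bin/nutch readseg -get <segment> <url>". The class name SegmentLookup, the helper buildCommands, and the idea of running the script from the Nutch install directory are hypothetical choices for illustration. The sketch only handles segments on the local filesystem, not HDFS, and how readseg reports a segment that lacks the URL should be checked against your Nutch version.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: not part of Nutch itself.
public class SegmentLookup {

    // Builds one "bin/nutch readseg -get <segment> <url>" command
    // for each subdirectory of segmentsDir, in sorted (oldest-first) order.
    public static List<List<String>> buildCommands(Path segmentsDir, String url)
            throws IOException {
        List<Path> segments = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(segmentsDir)) {
            for (Path entry : stream) {
                if (Files.isDirectory(entry)) {
                    segments.add(entry); // plain files are skipped
                }
            }
        }
        Collections.sort(segments);

        List<List<String>> commands = new ArrayList<>();
        for (Path segment : segments) {
            List<String> cmd = new ArrayList<>();
            cmd.add("bin/nutch"); // assumes the working directory is the Nutch install
            cmd.add("readseg");
            cmd.add("-get");
            cmd.add(segment.toString());
            cmd.add(url);
            commands.add(cmd);
        }
        return commands;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: SegmentLookup <segments dir> <url>");
            return;
        }
        for (List<String> cmd : buildCommands(Paths.get(args[0]), args[1])) {
            System.out.println("Running: " + String.join(" ", cmd));
            // Output of each readseg call goes straight to this console.
            Process process = new ProcessBuilder(cmd).inheritIO().start();
            process.waitFor();
        }
    }
}
```

Run from the Nutch install directory, for example as "java SegmentLookup crawl/segments http://example.com/page.html", where both the segments path and the URL are placeholders. This scans every segment, so it can be slow on a large crawl.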