Java Mailing List Archive

http://www.java2.5341.com/

nutch-user.lucene

How to make a complete site crawl on regular basis?

plat hpc

2008-08-13

Hi,

I am new to Nutch. I managed to install it and set it up, and ran my first
crawl a few months ago. My site now has some new posts and updates, but
Nutch wasn't reflecting them, so I ran another crawl:

    bin/nutch crawl urls -dir crawl -depth 4 -topN 200

Even after that, the new posts and updates don't seem to have been picked up.

Could anyone please tell me the steps/commands to get Nutch to crawl my
whole site and pick up updates on a regular basis?

Thanks.