Java Mailing List Archive

http://www.java2.5341.com/


Crawl and Merge questions

Alex Basa

2008-10-23


Does anyone know which crawl output directories are required for a successful crawl? Do crawldb, indexes, index, linkdb and segments all have to be present for a merge to succeed?

I'm crawling on 5 servers and writing to the SAN. The crawls themselves run quickly and without problems (up to several million documents). My problem is that merging the indexes with mergecrawls.sh takes a very long time. Is there any performance tuning that can speed up mergecrawls.sh?

Thanks in advance,

Alex


   