Java Mailing List Archive

http://www.java2.5341.com/

Mailing list: nutch-user.lucene

alternation of topN

Marcel T

2008-11-17


Hi there,

In the Nutch crawl example there is a topN parameter. It limits the total number of pages crawled at each depth level, and the limit applies across all URLs together. Is there a way to set a limit per page instead? For example, with a limit of 5, at most 5 outgoing links from each URL would be added to the queue.

Thanks.
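
To make the distinction concrete, here is a minimal sketch of the two kinds of limit on a toy in-memory link graph. It is not Nutch code: the function `crawl`, the parameter names `top_n` and `max_outlinks_per_page`, and the sample graph are all hypothetical, and the toy simply takes the first entries of each list, whereas a real crawler would typically rank URLs before applying a limit.

```python
def crawl(graph, seeds, depth, top_n=None, max_outlinks_per_page=None):
    """Toy breadth-first crawl over a dict mapping URL -> list of outlinks.

    top_n caps how many URLs are fetched per depth level (a global limit).
    max_outlinks_per_page caps how many links each fetched page may
    contribute to the queue (a per-page limit).
    Both names are illustrative assumptions, not real Nutch settings.
    """
    seen = set(seeds)
    frontier = list(seeds)
    fetched = []
    for _ in range(depth):
        # Global limit: fetch only the first top_n URLs of this level.
        # In this toy, the URLs that are cut off are simply dropped.
        batch = frontier if top_n is None else frontier[:top_n]
        next_frontier = []
        for url in batch:
            fetched.append(url)
            outlinks = graph.get(url, [])
            # Per-page limit: keep at most N outlinks from this page.
            if max_outlinks_per_page is not None:
                outlinks = outlinks[:max_outlinks_per_page]
            for link in outlinks:
                if link not in seen:
                    seen.add(link)
                    next_frontier.append(link)
        frontier = next_frontier
    return fetched


# Hypothetical link graph used only for illustration.
graph = {
    "seed": ["a", "b", "c"],
    "a": ["a1", "a2", "a3"],
    "b": ["b1"],
    "c": ["c1", "c2"],
}

global_limit = crawl(graph, ["seed"], depth=3, top_n=2)
per_page_limit = crawl(graph, ["seed"], depth=3, max_outlinks_per_page=2)
print(global_limit)
print(per_page_limit)
```

With the global limit, a page with many links can fill the whole level on its own; with the per-page limit, every fetched page gets to contribute, but only up to the cap. That second behavior is what I am looking for.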