I created my URL list file from my Google sitemap, so it contains all of my
URLs, and then set the crawl depth to 1 because I don't want the crawler to
follow any sublinks.
When I looked at the log, I found that the crawler doesn't work through the
URL list line by line, but in what looks like random order. Is there a reason for this?
Or do I actually have to set the depth to 0 instead of 1?
(Because the crawl takes a while, I wanted to use the log to check
which URL the crawler is currently on, but couldn't.)
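
For context, here is a minimal sketch of the kind of sitemap-to-URL-list conversion described above. It assumes a standard sitemaps.org XML file; the function name sitemap_to_url_list and the sample URLs are hypothetical, and this is not necessarily how the list was actually built. It keeps the URLs in sitemap order, one per line, which is the order one might expect to see in the log.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files (assumption: the
# Google sitemap in question follows this schema).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_to_url_list(sitemap_xml):
    """Return the <loc> URLs of a sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.iter(SITEMAP_NS + "loc"):
        # Skip empty <loc> elements and trim surrounding whitespace.
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


if __name__ == "__main__":
    # Hypothetical sample sitemap, for illustration only.
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://example.com/a.html</loc></url>"
        "<url><loc>http://example.com/b.html</loc></url>"
        "</urlset>"
    )
    # One URL per line, as in a plain-text URL list file.
    print("\n".join(sitemap_to_url_list(sample)))
```

The order of the lines in such a file only describes the input; it does not by itself say anything about the order in which a crawler will fetch them.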
--
Sent from the Nutch - User mailing list archive at Nabble.com.