Hi,
I have successfully configured Nutch 0.9. It crawls a number of sites, and searching the crawled content also works properly.
Now I want to crawl password-protected pages with Nutch. Accessing those pages requires a valid user name and password, which I have configured in my nutch-site.xml and httpclient-auth.xml.
However, the protected pages are still not being crawled.
I have attached nutch-site.xml, httpclient-auth.xml and hadoop.log in a zip file for your reference. Kindly check them and let me know what I am missing.
CONFIGURATION:
nutch-2008-07-10_04-01-48.tar (downloaded from http://hudson.zones.apache.org/hudson/job/Nutch-trunk/, which contains your patch for HttpAuthentication)
Windows XP, Cygwin, jdk1.6.0
Thanks in advance for your help.
Best regards,
Biswajit
Attachment: Nutch.zip (zipped)