Hi,
I'm using Nutch to crawl an intranet site that is behind form
authentication. I know Nutch doesn't support form authentication yet
(right?), but I think this site would also work with cookies. I have
the right set of cookie names and values, at least for testing, but I
don't know how to have Nutch send these cookies with every HTTP
request during its crawl.
I saw a reference to a "protocol-httpclient" plugin. Is that relevant here?
Any help on configuring Nutch to use cookies for authentication would
be appreciated.
--
Thanks,
Yoav