It looks to me like Nutch doesn't follow relative links on a page. I have checked the FAQ and set db.max.outlinks.per.page to -1, but that makes no difference in my case.
<property>
<name>db.max.outlinks.per.page</name>
<value>-1</value>
<description>The maximum number of outlinks that we'll process for a page.
If this value is nonnegative (>=0), at most db.max.outlinks.per.page outlinks
will be processed for a page; otherwise, all outlinks will be processed.
</description>
</property>
Here's an example of a relative URL on my intranet home page:
<a class=cbl1 href="/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date">View by date</a>
Is there something I should configure to handle these?
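For reference, this is what I would expect that link to resolve to. It is only a minimal sketch using the standard java.net.URI class, not Nutch's own parsing code, and the host intranet.example.com and the page name index.html are made-up placeholders for my real intranet address.

```java
import java.net.URI;

public class RelativeLinkDemo {
    // Resolve an href against the URL of the page it was found on,
    // using the standard RFC 3986 rules implemented by java.net.URI.
    public static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base URL; the real intranet host is not shown here.
        String page = "http://intranet.example.com/index.html";
        String href = "/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date";
        System.out.println(resolve(page, href));
    }
}
```

Since the href starts with a slash, only the scheme and host of the page should be kept and the path replaced, so I would expect the crawler to end up with a URL of that shape in its outlinks.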
Thanks for any help.
Ed.