Java Mailing List Archive

http://www.java2.5341.com/

nutch-user.lucene

relative urls

Edward Quick

2008-09-10


It looks to me like Nutch doesn't handle pages with relative links. I have checked the FAQ and set db.max.outlinks.per.page to -1, but that makes no difference in my case.

<property>
  <name>db.max.outlinks.per.page</name>
  <value>-1</value>
  <description>The maximum number of outlinks that we'll process for a page.
  If this value is nonnegative (>=0), at most db.max.outlinks.per.page outlinks
  will be processed for a page; otherwise, all outlinks will be processed.
  </description>
</property>


Here's an example of a relative url on my intranet home page:
<a class=cbl1 href="/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date">View by date</a>
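For reference, this is the absolute URL I would expect that href to resolve to. The snippet below is only a sketch in plain Java using java.net.URI, not Nutch's own parsing code; the host intranet.example.com, the page name index.html, and the class and method names are made-up placeholders for illustration.

```java
import java.net.URI;

// Hypothetical helper, not part of Nutch: shows standard URI resolution
// of a root-relative href against the URL of the page that contains it.
class RelativeLinkDemo {

    // Resolve href against the page URL it was found on.
    static String resolve(String pageUrl, String href) {
        return URI.create(pageUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Placeholder page URL (assumption); the href is the one from my page.
        String pageUrl = "http://intranet.example.com/index.html";
        String href = "/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date";

        // prints http://intranet.example.com/general/apps/feedback.nsf/$Control/view+Feedback+-+By+Date
        System.out.println(resolve(pageUrl, href));
    }
}
```

Because the href starts with a slash, only the scheme and host of the page URL are kept and the path is replaced entirely. If the page declared a base element, the base href would be used in place of the page URL.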

Is there something I should configure to handle these?

Thanks for any help.

Ed.



